Subject Re: [firebird-support] Charset for identifiers and check constraints?
Author Helen Borrie
At 03:24 PM 19/06/2008, you wrote:
>Helen Borrie wrote:
>> Not "inconsistent" from the Firebird p.o.v., though. You (and the tools)
>> must distinguish between metadata and data, specifically in DDL that
>> provides text constants in expressions defining constraints, computed
>> columns, domains, etc., and also user-defined triggers and stored
>> procedures that have been migrated without recompiling.
>But shouldn't the client send the DDL expression/query in the charset
>defined by the connection, and then FB would transliterate any parts of
>it as appropriate:
>...check ("Användare" <> 'abcåäö')...
> 1 2 3 4 5
>Parts 1, 3, and 5 would be transliterated to plain ascii.
>Part 2 would be transliterated to UTF8.
>Part 3 would be transliterated to the charset of "Användare".

Can't decode this in proportional font, but...

As I understand it, the constant (literal) string 'abcåäö' would be stored in whatever charset, in isql, if you had the SET NAMES correct for the column "Användare" and typed in 'abcåäö' in this expression, it would be transliterated to the charset of the column. The trap comes if you run this same DDL from a script saved in an editor with a non-matching charset, e.g., Notepad saving in the default ascii mode or Wordpad in wordpad's "unicode" mode (which is UTF-16, I think).

But, as I mentioned before, I don't have the setup to test the assumptions. You do, presumably, because you can type characters that I can't. The more you can tell us from actual experiment, the better we'll be able to figure out what the story is. Firebird's QA uses only the built-in tools, not third-party ones.

>Or is it actually the case that this transliteration should be made by
>the client? (Would seem rather odd...)

No, the client shouldn't be trying to do transliteration. But the client (as a whole, being application, API layer and host locale ) does need to provide the environment that ensures the server receives a well-formed string, i.e., the correct client-side character set parameter and input that, together, will cross the wire "well-formed" for correct transliteration. In the case where data has to be stored in column with a character set *other than* the default defined for the database (and set for the client), the client would need to supply also either an introducer or a cast to ensure that *that particular* transliteration occurs.

However, there's no guarantee that some client applications do not attempt to work around legacy transliteration traps in some way...if you're able to test the assumptions using isql and get results that conflict with the assumptions, then a reproducible test case in the tracker would be useful.

>Just trying to understand what's going on.

You're not alone. ;-)