Subject Re: [firebird-support] Charset for identifiers and check constraints?
Author Helen Borrie
At 07:11 AM 19/06/2008, you wrote:

>> create table "Användare" (
>> "Lösenord" varchar(35) not null,
>> constraint "AnvändareLösenordLängd"
>> check (char_length("Lösenord") >= 8)
>> );
>>
>> It executes fine, but when I try to commit I get this error:
>>
>> ------------------------------------
>> SQL Message : -104
>> Invalid token
>>
>> Engine Code : 335544343
>> Engine Message :
>> invalid request BLR at offset 16
>> column Lösenord is not defined in table Användare
>> ------------------------------------

In Fb 2.1 - and we assume your database was created under Fb 2.1 - the identifiers are stored as UTF-8. It looks as though Flamerobin might have done the mangling, by re-transliterating the ISO8859_1 input to UNICODE_FSS, which is the charset used for identifiers prior to v.2.1.

>I'm more or less guessing, but I suspect this is a but in FB. Can anyone
>please take a look, perhaps using a different client than FlameRobin?

Why can't you test it yourself, using isql? That removes the issues of your input setup with a graphical interface operating under your conditions and also highlights those same issues for your application development, if you are developing a graphical interface. Other people can't reproduce your conditions.

You do need to do SET NAMES ISO8859_1 to set the correct charset for the client. If you get the same effect with isql in your environment then it would be worthwhile posting a Tracker report.

>I suspect that FB fails to transliterate the check constraint "body"
>before "compiling" it to BLR(?) upon commit.

Fb should transliterate the input for data and metadata appropriately, as long as it is getting the correct information about the input. So the acid-test here will be to see what isql does. You are going to bump into related problems if you attempt to run scripts that were created with an editor that stores text in a charset other than UTF8; or if you're using a database you pulled forward from an older Fb version without doing the metadata conversion.

>If that's what's happening, is it a bug or by design?

First, eliminate the "third-party" effects. If the Flamerobin editor (or any other third-party tool) is doing stuff to input that causes problems with ODS 11.1 identifiers then the sooner the problem is identified and fixed, the better. On the other hand, if you are able to identify an environmental combination that creates a loophole in the transliteration routine for DDL being processed by isql, then Adriano needs to address it.

>If by design, how should I create a check constraint on a field with non-ascii characters?

There is more than one issue here.

One is the identifiers, specifically in this case the identifiers for table.column. The obvious assumption has been that it found the table definition but not the column definition. However, the error message returns a scrambled identifier for the *table*. The more likely inference is that it can't find the (correctly reported) column definition because it can't match the relation name. So find out whether isql causes the same error. TYPE IN the definition, don't paste it from some text file.

Another is the *data* that has to be matched by the CHECK constraint. You're not (so far) getting the error message that would be expected if the engine had ascertained that the search value was incompatible with the character set defined for the column.(Error code 335544849 "Malformed string"). If you're pasting text from an external editor, or FR's editor is passing the search value in a different charset, then you still might encounter *this* problem once you have sorted out the metadata charset problem.

The whole charset issue in Fb 2.1 is massively confused, at least to me. What seems obvious to the developer (who is developing and unit-testing in an international language environment) is much less obvious to the documenter (me), who doesn't have such an environment in which to test the guesswork.

I have interrogated Adriano and made an FAQ on the issue of transliteration, which will be in the Migration and Installation document for the v.2.1.1 release (though not in the RC1 version). However, I'm acutely aware that it's only part of a comprehensive but currently non-existent solution to the confusion. It could be improved a lot with input from people who encounter and solve these problems in their divers language environments - provided we get the feedback.

./heLen