Subject Re: [firebird-support] Bug with character sets
Author Brad Pepers
On 20-May-09, at 1:40 AM, Dimitry Sibiryakov wrote:

>>> This last step would probably require a lot of "knowledge" in fbclient
>>> of all possible character encodings, which I expect would bloat that dll
>>> a bit more than anyone would like. Or would it?
>>
>> Or just have a character length field in the data you receive and have
>> the server calculate this using the character set information it must
>> already have so on the client side I just get data I can use without
>> jumping through hoops!
>
> But you won't get the data anyway. I don't think that replacing some
> space characters with zero characters in data buffer will somehow
> help you.
> I can't understand why you are so determined to use CHAR when you
> obviously need VARCHAR. They are different datatypes with different
> purposes. You must not bend one to serve for purposes of other.

All I want to do is store one character in a column. It will always be
exactly one character, never zero, so char(1) is a better match than
varchar(1). But I'm also trying to use UTF8 everywhere, and this results
in getting back a buffer with a length of 4 for the char(1), which is
messing things up. I could switch to varchar(1) and just avoid "char"
types altogether, but I wanted to understand why Firebird is sending me
4 bytes for a char(1) column and what it expects me to do about it. It
seems the end result is that my client side code has to be aware of the
bytes per character of all the character sets the Firebird server
supports, and with this I can make "char" columns work on the client
side. Ugly but workable.
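
A minimal sketch of that client-side workaround, assuming the buffer for a
UTF8 char(n) column arrives as the character data followed by space padding
up to the full byte length (4 bytes per declared character). The function
name and parameters are hypothetical and not part of any Firebird API:

```python
def char_value(buf: bytes, sqllen: int, bytes_per_char: int = 4) -> str:
    """Recover the logical value of a CHAR(n) column from a padded buffer.

    Assumption: sqllen is the byte length reported for the column and
    bytes_per_char is the maximum bytes per character of the connection
    character set (4 for UTF8).
    """
    declared_chars = sqllen // bytes_per_char   # n in CHAR(n)
    text = buf[:sqllen].decode("utf-8")         # data plus trailing spaces
    return text[:declared_chars]                # keep only n characters


# A char(1) column holding "A" arrives as 4 bytes: b"A" plus 3 spaces.
value = char_value(b"A   ", 4)
```

Slicing by characters after decoding, rather than stripping trailing
spaces, keeps a value that really is a single space intact.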

>
>> Actually thinking now I'm not even sure I understand exactly what kind
>> of data I'll get back in different situations.
>
> You will get all character data in character set of connection unless
> this character set is NONE.
> If you are connected with character set NONE, you'll get data in
> storage character set.
> In any case field in SQLVAR will indicate this charset.

Well that is good to hear! At least if I use UTF8 for the connection
character set I will only have to deal with UTF8 data and never any
other character sets. As a matter of fact, if I know I'm always using
UTF8, then wouldn't every char result I get back always have a buffer
size in bytes of 4 times the character length, and couldn't I just rely
on that as a fact?
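
A rough sketch of how that rule could be applied, assuming the low byte of
sqlsubtype carries the character set id and the high byte the collation id.
The id-to-width table below is an assumption to verify against
RDB$CHARACTER_SETS on the actual server, and the names are hypothetical:

```python
# Assumed character set ids and their maximum bytes per character.
# Check these against RDB$CHARACTER_SETS before relying on them.
MAX_BYTES_PER_CHAR = {
    0: 1,  # NONE
    1: 1,  # OCTETS
    2: 1,  # ASCII
    3: 3,  # UNICODE_FSS
    4: 4,  # UTF8
}


def declared_char_length(sqllen: int, sqlsubtype: int) -> int:
    """Return n for a CHAR(n) column given its byte length and subtype."""
    charset_id = sqlsubtype & 0xFF          # low byte: character set id
    return sqllen // MAX_BYTES_PER_CHAR[charset_id]


# With a UTF8 connection, a 4-byte buffer corresponds to char(1).
n = declared_char_length(4, 4)
```

If the connection character set is always UTF8, the lookup collapses to a
plain division of the byte length by 4.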

>
>> Is there a document on all this that I've failed to find?
>
> Yes - Release Notes. Part "sqlsubtype and Attachment Character Set".

Thanks!

--
Brad