Subject Re: [firebird-support] Bug with character sets
Author Brad Pepers
On 19-May-09, at 1:09 PM, Ann W. Harrison wrote:

> Martijn,
>>
>> Then what does it transfer? The server knows how many characters
>> there are, doesn't it?
>
> Yes.
>
>>
>> How does this work for VARCHAR?
>
> A VARCHAR is passed as two bytes of character length, plus
> that many characters. So the actual stored character length
> is passed for a VARCHAR. The buffer is fixed size and
> probably padded with spaces.

So if I select a column using UTF8 for the character set and it has
two characters each of which takes up 2 bytes, I'll be told it is two
characters long and not 4 bytes?
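
To make the distinction concrete, here is a small Python sketch (plain Python, not Firebird code) of the case I mean. The sample string is only an assumption chosen so that each character takes 2 bytes in UTF-8:

```python
# Two characters that each encode to 2 bytes in UTF-8
# (U+00E9 and U+00FC, picked purely for illustration).
text = "\u00e9\u00fc"
encoded = text.encode("utf-8")

char_length = len(text)     # length counted in characters
byte_length = len(encoded)  # length counted in bytes after encoding

print(char_length, byte_length)
```

The question is which of those two numbers the length field holds.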

If this works, why couldn't CHAR work the same way? When selecting a
CHAR(2) column that uses UTF8 and holds one single-byte character and
one double-byte character, report that it is two characters long.
Use an 8 byte buffer so that it is large enough for the case of two 4
byte UTF8 characters but tell me the length in characters just like
VARCHAR does.
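
Roughly what I have in mind, as a Python sketch. This only simulates the idea and is not how the Firebird client library works; the function names and the space-padding layout are assumptions for illustration:

```python
DECLARED_CHARS = 2       # CHAR(2)
MAX_BYTES_PER_CHAR = 4   # worst case for one UTF-8 character
BUFFER_SIZE = DECLARED_CHARS * MAX_BYTES_PER_CHAR  # 8 bytes


def fill_char_buffer(value: str) -> bytes:
    """Hypothetical server side: encode and space-pad to the fixed buffer size."""
    padded_value = value.ljust(DECLARED_CHARS)  # CHAR pads to the declared length
    return padded_value.encode("utf-8").ljust(BUFFER_SIZE, b" ")


def read_char_buffer(buffer: bytes, char_length: int) -> str:
    """Hypothetical client side: keep only char_length characters."""
    return buffer.decode("utf-8")[:char_length]


# One single-byte character and one double-byte character.
buffer = fill_char_buffer("a\u00e9")
value = read_char_buffer(buffer, DECLARED_CHARS)

print(len(buffer), repr(value))
```

Without the character length, the client sees an 8 byte buffer holding 3 bytes of data and 5 padding spaces, and cannot tell from the buffer alone where the two-character value ends.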

>
>>
>> Why is the CHAR buffer padded, but it seems VARCHAR isn't?
>
> Both are padded. VARCHAR has the character length. CHAR
> does not. Perhaps the easiest solution is to cast all strings
> to VARCHAR if you're using a multi-byte (and especially a
> variable length multi-byte) character set.

That was one option I was looking at to make this work. It is a shame
though to have to work around it like this.

--
Brad