firebird-support - Re: Using unicode versus WIN1252 (Firebird

Subject	Re: Using unicode versus WIN1252 (Firebird
Author	skoczian
Post date	2009-01-11T20:13:35Z

--- In firebird-support@yahoogroups.com, Dmitry Yemanov <dimitr@...>
wrote:

>
> Over the wire, this UTF8 string is sent "as is" for CHARs and without
> trailing zeros for VARCHARs (they're stripped at the remote protocol
> level).
>
> On the client, your application allocates 4*N bytes in the result
> buffer (this length is reported in XSQLVAR::sqllen). The fetched string is
> stored in this buffer. I.e. again, this is a "thin" UTF8 string with
> some trailing zeros.
>
> In other words, a proper UTF8 string is always being stored and
> transferred, but often it has some extra zero bytes at the tail (up to
> the 4*N buffer length).
>
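
To illustrate the quoted description, here is a small Python sketch of a
4*N-byte result buffer holding a "thin" UTF8 string followed by zero bytes.
It is only a model of the layout described above, not Firebird client code:
result_buffer and read_string are hypothetical helper names, and the sample
text is made up.

```python
MAX_UTF8_BYTES_PER_CHAR = 4  # the "4" in the 4*N buffer length


def result_buffer(text: str, n: int) -> bytes:
    """Hypothetical model of a 4*n-byte result buffer:
    the UTF-8 bytes of the text, followed by zero bytes up to 4*n."""
    if len(text) > n:
        raise ValueError("text is longer than the declared length")
    data = text.encode("utf-8")
    size = MAX_UTF8_BYTES_PER_CHAR * n
    return data + b"\x00" * (size - len(data))


def read_string(buf: bytes) -> str:
    """Drop the trailing zero bytes and decode what remains."""
    return buf.rstrip(b"\x00").decode("utf-8")


buf = result_buffer("Grüße", 10)
print(len(buf))          # 40 bytes allocated for a 10-character field
print(read_string(buf))  # Grüße
```

Here the five characters take only 7 bytes in UTF8, so the remaining 33
bytes of the 40-byte buffer are filler.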

There is one more thing: even if the database character set uses one
byte per character (ISO8859_something, ASCII, WINsomething...), when
the connection character set is UTF8 the client gets 4*N bytes for a
CHAR(N) field, padded with spaces. That doesn't only look ugly in a
tabular representation, it can also trip you up when working with the data.
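
A minimal Python sketch of how the padding can trip you up, and one way
to deal with it. It assumes the value arrives as UTF8 bytes space-padded to
4*N bytes, as described above; simulate_char_fetch and logical_char_value
are hypothetical names, not part of any Firebird driver.

```python
def simulate_char_fetch(text: str, n: int) -> bytes:
    """Hypothetical stand-in for a CHAR(n) value fetched over a UTF8
    connection: the UTF-8 bytes of the text, space-padded to 4*n bytes."""
    if len(text) > n:
        raise ValueError("text is longer than the declared length")
    return text.encode("utf-8").ljust(4 * n, b" ")


def logical_char_value(buf: bytes, n: int) -> str:
    """Decode the buffer and cut it back to the n characters
    that the column actually declares."""
    return buf.decode("utf-8")[:n]


buf = simulate_char_fetch("Käse", 6)

raw = buf.decode("utf-8")
print(len(buf))       # 24 bytes for a CHAR(6) field
print(raw == "Käse")  # False: the padding breaks a naive comparison

value = logical_char_value(buf, 6)
print(repr(value))               # 'Käse  '
print(value.rstrip() == "Käse")  # True
```

Cutting back to N characters keeps the usual CHAR(N) semantics; stripping
the trailing spaces as well is a separate choice that depends on whether
the application treats them as significant.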

On the other hand, if the system character set is UTF8, a different
connection character set causes problems with non-ASCII characters.
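
The same kind of problem can be reproduced with Python's own codecs, as a
rough analogy only: text that is fine in UTF8 may contain characters that a
one-byte character set such as WIN1252 (cp1252 in Python) cannot represent.
send_over_connection is a hypothetical helper, and how Firebird itself
reports such a case is not shown here.

```python
def send_over_connection(text: str, connection_charset: str) -> bytes:
    """Hypothetical stand-in for converting stored text
    to the connection character set."""
    return text.encode(connection_charset)


ascii_only = "Lodz"
beyond_win1252 = "Łódź"  # Ł and ź have no code point in cp1252

# Pure ASCII text survives any of these character sets.
print(send_over_connection(ascii_only, "cp1252"))

# Non-ASCII text may not be representable at all.
try:
    send_over_connection(beyond_win1252, "cp1252")
    converted = True
except UnicodeEncodeError:
    converted = False

print(converted)  # False
```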

Regards
Sibylle