Subject Re: Using unicode versus WIN1252 (Firebird
Author skoczian
--- In firebird-support@yahoogroups.com, Dmitry Yemanov <dimitr@...>
wrote:
>
> Over the wire, this UTF8 string is sent "as is" for CHARs and without
> trailing zeros for VARCHARs (they're stripped at the remote protocol
> level).
>
> On the client, your application allocates 4*N bytes in the result
> buffer (this length is reported in XSQLVAR::sqllen). The fetched string
> is stored in this buffer. I.e. again, this is a "thin" UTF8 string with
> some trailing zeros.
>
> In other words, a proper UTF8 string is always being stored and
> transferred, but often it has some extra zero bytes at the tail (to
> fill the 4*N buffer length).
>

One more thing: even if the database character set uses one byte per
character (ISO8859_something, ASCII, WINsomething...), when the
connection character set is UTF8 the client still gets 4*N bytes for a
CHAR(N) field, padded with spaces. That doesn't only look ugly in a
tabular representation, it can also trip you up when working with the
data.
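
To make the padding concrete, here is a minimal Python sketch. It does not
talk to Firebird at all: pad_char_utf8 is a hypothetical helper that only
imitates the 4*N byte buffer described above, and trim_char_utf8 is one
possible way to cut the value back to N characters on the client side.

```python
def pad_char_utf8(value: str, n: int) -> bytes:
    """Imitate (assumption) a CHAR(n) value as seen through a UTF8
    connection: UTF-8 bytes, space-padded to 4*n bytes."""
    raw = value.encode("utf-8")
    return raw.ljust(4 * n, b" ")


def trim_char_utf8(buf: bytes, n: int) -> str:
    """Decode the buffer and keep only the first n characters,
    which is the declared CHAR(n) length."""
    return buf.decode("utf-8")[:n]


buf = pad_char_utf8("Zoë", 5)        # CHAR(5) column holding 'Zoë'
padded = buf.decode("utf-8")         # what a naive client would display
value = trim_char_utf8(buf, 5)       # cut back to 5 characters
```

In this sketch the buffer is 20 bytes long, the naively decoded string has
19 characters, and the trimmed value is the expected 5 characters. If the
trailing blanks are not wanted at all, calling rstrip() on the result is the
simpler option.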

On the other hand, if the system character set is UTF8, using a different
connection character set causes problems with non-ASCII characters.
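
One kind of problem this can mean, as a rough analogy only: a single-byte
character set simply has no slot for many of the characters that UTF8 can
store. The Python sketch below uses Python's cp1252 codec as a stand-in for
WIN1252; fits_charset is a hypothetical helper, and Firebird's own
conversion between character sets is not modelled here.

```python
def fits_charset(text: str, codec: str) -> bool:
    """Return True if every character of text can be encoded in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ok_western = fits_charset("Zoë", "cp1252")    # 'ë' exists in cp1252
ok_polish = fits_charset("Łódź", "cp1252")    # 'Ł' and 'ź' do not
ok_ascii = fits_charset("Lodz", "cp1252")     # plain ASCII always fits
```

Pure ASCII data passes such a check with any of these character sets, which
is why the trouble only shows up with non-ASCII characters.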

Regards
Sibylle