Subject Re: [firebird-support] Re: Using unicode versus WIN1252 (Firebird
Author Douglas Tosi
On Sun, Jan 11, 2009 at 1:39 PM, Dmitry Yemanov
<dimitr@...> wrote:
> The engine reads the RLE-compressed record and decompresses it. The
> resulting string is represented as UTF8 in memory, inside a longer
> buffer (4-byte-chars -- in order to allow a longer UTF8 string to be
> stored without reallocation). This buffer is transferred to the client
> side "as is", because the API also allocates a longer buffer
> (4-byte-chars).

Thanks Dmitry and Milan.

If I understood correctly:

On disk: UTF8 does not use variable-byte characters. Every character
is 4 bytes long, but the whole string is RLE-compressed.
Over the wire: Same as on disk, except there is no RLE compression.
Network overhead for UTF8 strings is thus the same as for Unicode
strings.
On the client: fbclient strips the trailing zeros from characters
shorter than 4 bytes and returns a proper UTF8 string to the
application.

Is that it?
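
In case it helps to pin the question down, here is a rough Python sketch of
the buffer layout as described in the quoted text: variable-length UTF8
bytes sitting inside a buffer sized at 4 bytes per declared character. It is
only an illustration. The zero padding, the toy run-length encoder and the
names (to_fixed_buffer, toy_rle, strip_padding) are my own assumptions, not
Firebird's actual on-disk format, wire protocol or API.

```python
def to_fixed_buffer(text: str, declared_chars: int) -> bytes:
    """Put the UTF-8 bytes of text into a buffer of 4 bytes per declared char."""
    if len(text) > declared_chars:
        raise ValueError("text is longer than the declared length")
    encoded = text.encode("utf-8")
    capacity = 4 * declared_chars
    # Assumption: the unused tail of the buffer is zero-filled.
    return encoded.ljust(capacity, b"\x00")


def toy_rle(data: bytes) -> list[tuple[int, int]]:
    """Toy run-length encoding: a list of (byte value, run length) pairs."""
    runs: list[tuple[int, int]] = []
    for b in data:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def strip_padding(buffer: bytes) -> str:
    """What a client would do under this model: drop the tail, decode UTF-8."""
    return buffer.rstrip(b"\x00").decode("utf-8")


text = "caf\u00e9"                  # 4 characters, 5 bytes in UTF-8
buf = to_fixed_buffer(text, 10)     # hypothetical 10-character column
print(len(text.encode("utf-8")))    # 5
print(len(buf))                     # 40
print(toy_rle(buf)[-1])             # (0, 35): the padding is one long run
print(strip_padding(buf))
```

Under these assumptions the padding costs almost nothing once it is
run-length encoded, but the full 40 bytes would travel over the wire if the
buffer is sent "as is".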

--
Douglas Tosi
www.sinatica.com