Subject | Re: [firebird-support] Re: Using unicode versus WIN1252 (Firebird |
---|---|
Author | Douglas Tosi |
Post date | 2009-01-11T15:55:57Z |
On Sun, Jan 11, 2009 at 1:39 PM, Dmitry Yemanov
<dimitr@...> wrote:
> The engine reads the RLE-compressed record and decompress it. The
> resulting string is represented as UTF8 in memory, inside a longer
> buffer (4-byte-chars -- in order to allow a longer UTF8 string be stored
> without reallocation). This buffer is transfered to the client side "as
> is", because API also allocates a longer buffer (4-byte-chars).
Thanks Dmitry and Milan.
If I understood correctly:
On disk: UTF8 does not use variable-byte characters. Every character
is 4 bytes long but the whole string is RLE-compressed.
Over-the-wire: Same as on-disk, except there is no RLE-compression.
Network overhead for UTF8 strings is, thus, the same as for Unicode
strings.
On the client: fbclient strips the trailing zeros of characters
smaller than 4 bytes and returns a proper UTF8 string to the
application.
Is that it?
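To make the question concrete, here is a minimal sketch of the layout as Dmitry describes it: ordinary variable-length UTF8 bytes sitting at the start of a buffer sized at 4 bytes per declared character. The helper names, the zero fill byte and the toy run-length encoder are assumptions for illustration only; they are not Firebird's actual padding or compression code.

```python
def padded_utf8_buffer(text, declared_chars, pad=b"\x00"):
    """Place the UTF-8 bytes of `text` in a buffer of 4 * declared_chars bytes.

    The fill byte is a hypothetical choice; the thread does not say what
    the engine actually writes into the unused tail.
    """
    encoded = text.encode("utf-8")
    capacity = 4 * declared_chars  # worst case: every character needs 4 bytes
    if len(encoded) > capacity:
        raise ValueError("text does not fit the declared length")
    return encoded + pad * (capacity - len(encoded))


def toy_rle(data):
    """Toy run-length encoding: a list of (byte value, run length) pairs."""
    runs = []
    for b in data:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


# "caf\u00e9" is 4 characters but 5 UTF-8 bytes (the last character takes 2).
buf = padded_utf8_buffer("caf\u00e9", declared_chars=10)
runs = toy_rle(buf)
```

In this sketch the characters are not each widened to 4 bytes; only the buffer is oversized, and the unused tail collapses into a single run once any run-length scheme is applied, while an uncompressed transfer would carry the full buffer.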
--
Douglas Tosi
www.sinatica.com