Subject Re: [firebird-support] Re: Using unicode versus WIN1252 (Firebird
Author Dmitry Yemanov
Milan Babuskov wrote:
> Does this mean that when some application uses UTF8 connection character
> set, Firebird first reads UTF8 from disk, then 'expands' it to
> 4-byte-chars and later turns it back to UTF8 on the client side?

There is no place where a UTF8 string is 'expanded' into a
wide-character one (except inside the ICU internals). Everything is
handled as UTF8 with trailing zeros (up to the buffer size).

The engine reads the RLE-compressed record and decompresses it. The
resulting string is represented as UTF8 in memory, inside a longer
buffer (sized at 4 bytes per character, so that a longer UTF8 string can
be stored without reallocation). This buffer is transferred to the client
side "as is", because the API also allocates a longer buffer (4 bytes
per character).
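
To make the buffer layout concrete, here is a rough sketch in Python. It is
not Firebird code: the helper name utf8_buffer and the sample value are made
up for illustration, and the zero padding simply follows the description
above. It only shows that the string stays UTF8 throughout and that the
buffer is sized for the worst case of 4 bytes per character.

```python
MAX_BYTES_PER_CHAR = 4  # upper bound for one UTF-8 encoded character


def utf8_buffer(text: str, declared_chars: int) -> bytes:
    """Hypothetical helper: place UTF-8 text in a worst-case-sized buffer.

    The buffer holds declared_chars * 4 bytes; the unused tail is filled
    with zero bytes, as described in the message above.
    """
    if len(text) > declared_chars:
        raise ValueError("text exceeds the declared character length")
    encoded = text.encode("utf-8")  # stays UTF-8, never widened
    size = declared_chars * MAX_BYTES_PER_CHAR
    return encoded.ljust(size, b"\x00")  # pad up to the buffer size


# A column declared with 10 characters gets a 40-byte buffer.
buf = utf8_buffer("M\u00fcller", 10)
payload = buf.rstrip(b"\x00")
print(len(buf), len(payload), payload.decode("utf-8"))
```

The same bytes can be handed to the client unchanged, since the client-side
buffer is assumed to have the same worst-case size.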