Subject RE: [firebird-support] Unicode size
Author Helen Borrie
At 02:48 PM 15/12/2004 -0500, you wrote:

>Uncompressed UTF 8 - is 8 bytes.

Actually, it's 1 to 4 bytes per character (the original UTF-8 design
allowed up to 6). UNICODE_FSS is fixed at 3 bytes per character. If you're
interested, do try to seek out Peter Jacobi's postings on the subject of
UNICODE_FSS and how it could be made to behave more politely according to
various UTF rules.
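
To make the byte counts concrete, here is a small sketch in Python (my choice of language and of sample characters, nothing to do with Firebird itself) that measures how many bytes a few characters occupy once encoded as UTF-8. The helper name utf8_len is made up for the illustration.

```python
# Sample characters are arbitrary picks for illustration:
# Latin capital A, e-acute, the euro sign, and an emoji outside the BMP.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Return the number of bytes `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Print the code point and the encoded size of each sample.
    print(f"U+{ord(ch):04X} takes {utf8_len(ch)} byte(s) in UTF-8")
```

Anything in the Basic Multilingual Plane (up to U+FFFF) encodes in at most 3 bytes, which is the per-character allowance UNICODE_FSS works with; only characters beyond that range need a fourth byte.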

>It sounds not like UTF8, but UTF24.
>
>:: It's enough to
>:: put anyone off using UNICODE_FSS if there is some
>:: alternative. Adriano dos
>
>Aside from using 3 bytes are there other problems? I need Unicode and
>otherwise have been pretty happy with it.

Upper-casing and ordering are problems, because the Fb implementation
of U_FSS has no collation other than binary. (It's the collations
that make it possible to use locale-specific subsets, or "pages", from
generalised character sets, and to apply often complex rules for mapping
and matching.) Binary ordering (the default collation for all charsets)
means ordering is done strictly by code number, so orderings in any
locale other than US English will be nonsensical. U_FSS also has no
upper/lower case mappings, except possibly for the US ASCII equivalents.
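
As a rough illustration of what that means in practice, the following Python sketch emulates the two behaviours: ordering strictly by encoded byte values, and an upper-casing routine that only knows the ASCII letters. It is not Firebird code; the word list and the helper names binary_order and ascii_only_upper are invented for the example.

```python
# Hypothetical word list, chosen to mix cases and one accented capital.
words = ["zebra", "apple", "\u00c9mile", "Zebra"]


def binary_order(items):
    """Sort strings by their raw UTF-8 bytes, i.e. strictly by code number."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


def ascii_only_upper(text):
    """Upper-case only a-z and leave every other character untouched."""
    return "".join(
        chr(ord(c) - 32) if "a" <= c <= "z" else c
        for c in text
    )


# Capitals sort before all lower-case letters, and the accented capital
# lands after "zebra" because its code number is higher.
print(binary_order(words))

# The accented first letter is left in lower case.
print(ascii_only_upper("\u00e9mile"))
```

With these inputs, "Zebra" comes out ahead of "apple" and the accented name ends up last, and the upper-cased string still starts with a lower-case e-acute. A locale-aware collation would be expected to handle both cases differently.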

Other than that, it's just another character set - bigger than most, leaner
on options.

./heLen