Subject Re: [firebird-support] Re: Firebird and unicode
Author Peter Jacobi
Josef Gschwendtner wrote:
> What does the statement "use of non BMP characters is problematic" mean?
> In the Unicode-glossary I read "BMP-character: A Unicode encoded character
> having a BMP code point."
>
> What kind of characters are that?
> What do I have think of as a developer?

Unicode has settled on a code space of 17 * 65536 code points. The first
65536 code points form the BMP (Basic Multilingual Plane). For a long
time these were considered the only ones practically minded people had
to care about, with the other planes left for Klingon, hieroglyphs and
Linear B.
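
To make the arithmetic concrete, here is a small Python sketch. The
helper names plane_of and is_bmp are made up for this illustration and
have nothing to do with Firebird itself; they only show how a code
point maps to one of the 17 planes.

```python
PLANES = 17
PLANE_SIZE = 0x10000  # 65536 code points per plane

def plane_of(ch: str) -> int:
    """Return the plane number (0..16) of a single character."""
    return ord(ch) >> 16

def is_bmp(ch: str) -> bool:
    """True if the character lies in plane 0, the BMP (U+0000..U+FFFF)."""
    return plane_of(ch) == 0

total = PLANES * PLANE_SIZE          # size of the whole code space
print(total)                         # 1114112
print(is_bmp("A"))                   # True: U+0041 is in the BMP
print(is_bmp("\U00020000"))          # False: a CJK ideograph in plane 2
print(plane_of("\U00010000"))        # 1: first code point of the Linear B block
```

Anything for which is_bmp returns False is a "non BMP character" in the
sense of the original question.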

In the meantime I've learned that a large number of new ideographic
characters, allocated outside the BMP, are in fact used by newer
Chinese standards (like GB18030, if I got the number right), and that
by Chinese government order, new software that does not support these
characters must not be sold in China. This forced Microsoft to include
support for them in XP.

So, you see, if you are not targeting the Chinese market, you may still
ignore them.

Firebird's sort-of support for them is via surrogate pairs. This implies
that character counts start to go wrong once you use them, since each
such character is stored as two code units, and that our UTF-8 is,
strictly speaking, CESU-8.
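
As a rough illustration of what surrogate pairs and CESU mean at the
byte level, here is a Python sketch. It only demonstrates the encodings
in general and is not a description of Firebird's internal code; the
function names surrogate_pair, cesu8 and utf16_units are hypothetical,
and the input is assumed to be well-formed text without lone surrogates.

```python
def surrogate_pair(ch: str) -> tuple[int, int]:
    """Split a non-BMP character into its UTF-16 high/low surrogates."""
    cp = ord(ch)
    if cp < 0x10000:
        raise ValueError("BMP characters need no surrogate pair")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def cesu8(text: str) -> bytes:
    """Encode text as CESU-8: each surrogate becomes its own 3-byte sequence."""
    out = bytearray()
    for ch in text:
        if ord(ch) < 0x10000:
            out += ch.encode("utf-8")
        else:
            for unit in surrogate_pair(ch):
                # 'surrogatepass' lets the UTF-8 codec emit a lone surrogate
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)

def utf16_units(text: str) -> int:
    """Count 16-bit code units, which is what a surrogate-based count sees."""
    return len(text.encode("utf-16-be")) // 2

sample = "\U00020000"                          # one ideograph outside the BMP
print([hex(u) for u in surrogate_pair(sample)])  # ['0xd840', '0xdc00']
print(sample.encode("utf-8").hex())            # f0a08080     (real UTF-8, 4 bytes)
print(cesu8(sample).hex())                     # eda180edb080 (CESU-8, 6 bytes)
print(len(sample), utf16_units(sample))        # 1 2
```

The last line shows the counting problem: one character as far as the
user is concerned, but two units for anything that counts surrogates.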

Regards,
Peter Jacobi
