Subject RE: [Firebird-Architect] Re: [firebird-support] Writing UTF16 to the database
Author Claudio Valderrama C.
Olivier Mascia wrote:
> Despite the fact that some peoples' national characters would require 2
> to 3 bytes and some ideographic characters would extend to 4 bytes,
> UTF-8 is the single encoding that is conceptually easy to handle,
> easy to store (despite its variable-length nature), complete
> (it covers the whole Unicode standard), and the closest to the concept
> of a stream of bytes terminated by a zero (a C string). The fact that a
> pure ASCII (7-bit) string in UTF-8 is exactly the same binary as
> those same bytes in ASCII is also a nice facility.
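
As a quick check of the byte counts quoted above, here is a minimal sketch in
Python. The sample characters and the helper name utf8_len are assumptions
picked for this illustration, not anything taken from the thread or from
Firebird itself.

```python
# Illustrative sketch only: the sample characters below are assumptions
# chosen to cover the 1-, 2-, 3- and 4-byte cases of UTF-8.
samples = {
    "A": "ASCII letter",
    "\u00e9": "French accented letter (e acute)",
    "\u042f": "Cyrillic letter (Ya)",
    "\u65e5": "CJK ideograph inside the BMP",
    "\U00020000": "CJK ideograph outside the BMP",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {utf8_len(ch)} byte(s)")

# A pure 7-bit ASCII string has identical bytes in ASCII and in UTF-8.
text = "Firebird"
assert text.encode("ascii") == text.encode("utf-8")
```

Note that byte length and character count diverge as soon as a string leaves
the ASCII range, which is exactly what an application has to deal with when
it handles the encoding itself.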

Do I need to read between the lines that you have an abstraction layer to
handle French correctly?

If so, do you recommend that every app developer reinvent the wheel to handle
Slavic (Russian, Belarusian, Bulgarian, Czech, Lusatian/Sorbian [what is
it, German people?], Polish, Slovak, Serbian, Ukrainian), Romance aka Latin
derivatives (Spanish, Portuguese, Catalan, French, Italian, Romanian), West
Germanic variations (German, Dutch, English, Frisian), Sino-Tibetan
variations (Burmese, Chinese, Thai, Tibetan), Altaic variations (Japanese,
Korean, Mongolian, Tungusic, Turkic) and the like in a single shot? And we
are leaving Malayo-Polynesian (Malay, Indonesian, Balinese, Oceanic, which in
turn comprises AFAIK Fijian, Hawaiian, Maori, Tahitian, and Helen can tell me
whether our Easter Island's language, Rapa Nui, is a Maori variation) and
still-living South American languages (like Chilean-Argentinian Mapudungun)
out of the equation.
:-)

Do you suggest that applications handle variable MBCS, Olivier?

C.