Subject Re: [Firebird-Architect] Re: UTF-8 vs UTF-16
Author Olivier Mascia
Hello Dimitry,

Monday, August 25, 2003, 8:56:32 AM, you wrote:

DS> On 24 Aug 2003 at 11:48, David Schnepper wrote:

>>I think, for Firebird, that UNICODE-16 support should be put in, and
>>if people want to store the supplemental characters into it, well,
>>it works for point c above, it should work for point b -- (as I
>>doubt there are any other character sets that encode the characters
>>other than how Unicode supplemental would) - and it doesn't work for
>>point a.

DS> I think that Firebird engine could (and should) become completely
DS> UNICODE-16. No zoo of charsets, no multibyte encodings with a
DS> sword of Damocles of buffer overflows. Charsets have a part on
DS> client side only. And AFAIK only one charset is used by client
DS> side at a time. And this charset is determined by locale. Let's
DS> put UNICODE->desired charset conversion to client side where this
DS> can be done by system calls. The only language-aware variable
DS> thing that is left on server is sorting. I don't know such languages
DS> as French and Spanish and can't tell if the same characters can
DS> take different positions in sorting order. Probably even sorting
DS> can be done according to one char-position table.

Though this would imply *much* complexity in upgrading existing
databases when moving from a previous FB server to a purely
UNICODE-16 based one, I second this idea (as a medium-term goal). It
looks much more convenient and appealing than the whole set of charset
issues we have today.
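
To make the client-side conversion idea concrete, here is a minimal
sketch, assuming the server only ever hands out UTF-16 (little-endian
here) and the client re-encodes to its own charset. The function name
to_client_charset is hypothetical and is not part of any Firebird API;
a real client would take the charset from its locale (for example via
locale.getpreferredencoding()) instead of a hard-coded name.

```python
def to_client_charset(utf16_bytes: bytes, client_charset: str) -> bytes:
    """Re-encode server-side UTF-16LE data into the client's charset.

    Characters the client charset cannot represent become '?', which
    is one possible policy (an assumption of this sketch), not the
    only one.
    """
    text = utf16_bytes.decode("utf-16-le")
    return text.encode(client_charset, errors="replace")


# What the (hypothetical) all-Unicode server would store for "Été".
stored = "\u00c9t\u00e9".encode("utf-16-le")

print(to_client_charset(stored, "iso-8859-1"))  # Latin-1 can hold it
print(to_client_charset(stored, "ascii"))       # ASCII cannot: lossy
```

The point of the sketch is that the lossy step happens only at the
client boundary; the stored data stays intact whatever the client
charset is.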

Regarding the French language, accented characters should sort like
the same characters without accents, just as in a French dictionary.
I don't know about Spanish. But I assume this rule is mostly valid,
with maybe some exceptions that could justify some collating rules.
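
The "accented sorts like unaccented" rule can be sketched as follows.
This is only an approximation under stated assumptions: it strips
combining marks after Unicode decomposition and uses the original
spelling as a tie-breaker, whereas real French collation (as in the
Unicode Collation Algorithm) treats accents as a secondary difference
with its own rules. It would also fold Spanish "ñ" into "n", which is
exactly the kind of exception that would need a separate collation.
The name accent_blind_key is hypothetical.

```python
import unicodedata


def accent_blind_key(word: str):
    """Sort key that compares base letters first, ignoring accents."""
    # NFD splits "é" into "e" + a combining acute accent.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks to get the unaccented base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word is a tie-breaker so the order is deterministic.
    return (base.casefold(), word)


words = ["zèbre", "étude", "cote", "eau", "côte", "abri"]

# Plain code-point order puts "étude" after "zèbre".
print(sorted(words))
# Accent-blind order puts "étude" between "eau" and "zèbre".
print(sorted(words, key=accent_blind_key))
```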

--
Best regards,
Olivier Mascia

War is too serious a matter to be left to the
military. (Georges Clemenceau)