Subject CHAR, NCHAR defaults (was Re: UTF-8 vs UTF-16)
Author peter_jacobi.rm
Hi mailmur, All,

--- In Firebird-Architect@yahoogroups.com, "mailmur" wrote:
> MSSQLServer does it well with char/nchar choice. It cannot do a user-
> specified charsets per column, but minority db users probably need
> such feature.
>
> In the year +2000 all international-aware dbapps should use the
> unicode charset. And then dbserver should eat it transparently.

I'll have a second try at replying to your message,
and I'm looking for my constructive side here.

Yes, more things would work out of the box, and without
reading the manual, when there is only a CHAR/NCHAR choice.

If I understand MSSQL correctly, NCHAR is UTF-16 there and
CHAR is something system-dependent (the current system ANSI
codepage, I assume). I wonder whether something goes south
when you switch the current system ANSI codepage, or when
different system ANSI codepages are set across a network.

But a safer variant of that may be desirable (with
per-database and per-column overrides still functioning
for the weirdos):

NCHAR = UTF16BE always
CHAR = ISO-8859-X with configurable x (default 1)
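
To make the storage consequence of these two defaults concrete,
here is a minimal sketch. It assumes that Python's codec names
"iso8859_1" and "utf_16_be" are fair stand-ins for the proposed
CHAR and NCHAR defaults. The helper stored_bytes is hypothetical
and only models a fixed-width, space-padded column value, not
any actual on-disk format.

```python
# Sketch only: Python codec names stand in for the two proposed defaults.
CHAR_DEFAULT = "iso8859_1"    # CHAR  = ISO-8859-X with x = 1
NCHAR_DEFAULT = "utf_16_be"   # NCHAR = UTF16BE

def stored_bytes(text, width, codec):
    """Encode text space-padded to a fixed width of `width` characters."""
    return text.ljust(width).encode(codec)

char_value = stored_bytes("Jacobi", 10, CHAR_DEFAULT)
nchar_value = stored_bytes("Jacobi", 10, NCHAR_DEFAULT)

# One byte per character for CHAR, two bytes per character for NCHAR.
print(len(char_value), len(nchar_value))
```

For text that fits the configured ISO-8859 part, the CHAR value
takes half the bytes of the NCHAR value.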

Both character sets would use the same configurable
<language>_<country> collation (default en_US).

The friendly Win32 installer can even configure those
settings according to "Regional Options" settings stored
in the registry.
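
A rough sketch of what such an installer step could derive from a
locale name follows. Everything in it is an assumption made for
illustration: the function name easy_mode_defaults, the small
language-to-part table, and the shape of the returned settings.
Reading the actual "Regional Options" values from the registry
is left out.

```python
# Hypothetical table: language code -> ISO-8859 part for easy-mode CHAR.
# Languages not listed fall back to part 1 (the proposed default).
PART_BY_LANGUAGE = {"ru": 5, "ar": 6, "el": 7, "he": 8, "th": 11}

def easy_mode_defaults(locale_name="en_US"):
    """Derive easy-mode settings from a <language>_<country> name."""
    language = locale_name.split("_")[0].lower()
    part = PART_BY_LANGUAGE.get(language, 1)
    return {
        "CHAR": f"ISO-8859-{part}",
        "NCHAR": "UTF16BE",          # fixed, never locale dependent
        "collation": locale_name,    # shared by CHAR and NCHAR
    }

print(easy_mode_defaults())
print(easy_mode_defaults("ru_RU"))
```

A real installer would need a complete table (including the
other Latin parts), which this sketch does not attempt.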

UTF16BE support must be enhanced for this (the RLE
compression needs to handle the UNICODE space U+0020).
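
The following sketch shows why U+0020 padding is a problem for
run-length encoding, under the assumption that the compressor
looks for runs of identical bytes. The rle_runs helper is a
generic illustration and not the engine's actual algorithm.

```python
def rle_runs(units):
    """Collapse a sequence into (count, value) runs."""
    runs = []
    for u in units:
        if runs and runs[-1][1] == u:
            runs[-1][0] += 1
        else:
            runs.append([1, u])
    return [(count, value) for count, value in runs]

# Two letters plus six pad spaces, 16 bytes in UTF-16BE.
padded = "ab".ljust(8).encode("utf_16_be")

# Byte-wise: the padding is 00 20 00 20 ..., so no two
# neighbouring bytes are equal and every run has length 1.
byte_runs = rle_runs(padded)

# Per 16-bit code unit: the six U+0020 units form a single run.
units = [padded[i:i + 2] for i in range(0, len(padded), 2)]
unit_runs = rle_runs(units)

print(len(byte_runs), len(unit_runs))
```

With 16-bit units the trailing padding collapses into one run,
while the byte-wise view finds nothing to compress.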

CJK users wouldn't have their legacy MBCS supported in easy
mode, but they can choose to use the old methods or go UNICODE.

Besides the 10 Latin alphabets, easy mode CHAR would support
Cyrillic, Greek, Hebrew, Arabic and Thai.
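
As a quick check of that coverage, the sketch below uses
Python's ISO-8859 codecs. The part numbers are those of the
ISO-8859 series (Latin-1 to Latin-10 are parts 1, 2, 3, 4, 9,
10, 13, 14, 15 and 16). The sample words and the fits helper
are only illustrative.

```python
LATIN_PARTS = [1, 2, 3, 4, 9, 10, 13, 14, 15, 16]  # Latin-1 .. Latin-10

# Non-Latin parts with a sample word in the matching script.
OTHER_PARTS = {
    5: "Привет",   # Cyrillic
    6: "سلام",     # Arabic
    7: "αβγ",      # Greek
    8: "שלום",     # Hebrew
    11: "สวัสดี",   # Thai
}

def fits(text, part):
    """True if text can be stored in the given ISO-8859 part."""
    try:
        text.encode(f"iso8859_{part}")
        return True
    except UnicodeEncodeError:
        return False

for part, sample in OTHER_PARTS.items():
    print(part, fits(sample, part))

# CJK text fits none of the single-byte parts.
print(any(fits("日本語", p) for p in LATIN_PARTS + list(OTHER_PARTS)))
```

The last line is the CJK case: such text needs either the
legacy multi-byte character sets or NCHAR.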

Regards,
Peter Jacobi