Subject Re: [Firebird-Architect] Re: [firebird-support] Writing UTF16 to the database
Author Dimitry Sibiryakov
On 28 Feb 2005 at 10:41, Olivier Mascia wrote:

>> If anyone can name a language that (s)he would like to store in FB
>> but it is not presented in UCS-2, we can consider using UCS-4.
>
>That would be a dumb choice (to opt for UCS-4). 32 bits per character
>is more than needed. Unicode needs more than 16 bits, but fewer
>than 24 bits.

So, you call Linux designers "dumb"? ;-)

>16 bits per character is not enough unless you actually implement
>UTF-16 correctly, which *is* a variable length encoding. UTF-16 can
>use more than one 16-bit word to represent a single character. This
>offsets the advantage of using a pure 16-bit encoding. What's more,
>UTF-8 has other big advantages over all other solutions.
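
  To make the quoted point about UTF-16 concrete, here is a minimal
  Python sketch (an illustration only, not anything from Firebird; the
  helper name utf16_units is made up). It counts how many 16-bit code
  units one character needs when encoded as UTF-16.

```python
def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for one code point."""
    # "utf-16-le" emits no byte-order mark, so the byte length divided
    # by two is exactly the number of code units.
    return len(ch.encode("utf-16-le")) // 2


print(utf16_units("A"))           # Latin letter, inside the BMP
print(utf16_units("\u20ac"))      # EURO SIGN, still inside the BMP
print(utf16_units("\U0001D11E"))  # MUSICAL SYMBOL G CLEF, outside the BMP
```

  Characters outside the Basic Multilingual Plane take a surrogate pair,
  which is why UTF-16 is a variable-length encoding.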

  That's right, UTF-8 has its advantages, and it can be considered
as an encoding for network transfer. But using constant-length
encodings for storage and processing is more convenient, IMHO.
Address arithmetic is faster than a scan, isn't it?
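
  A rough sketch of that difference in Python, assuming UTF-32 (UCS-4)
as the constant-length encoding and UTF-8 as the variable-length one.
The function names are hypothetical and this says nothing about how
Firebird actually stores strings; it only shows direct offset
computation versus a scan over lead bytes.

```python
def utf8_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def char_at_utf32(buf: bytes, n: int) -> str:
    """Constant-length lookup: the n-th character starts at byte 4 * n."""
    return buf[4 * n : 4 * n + 4].decode("utf-32-le")


def char_at_utf8(buf: bytes, n: int) -> str:
    """Variable-length lookup: skip n characters one by one, then decode."""
    i = 0
    for _ in range(n):
        i += utf8_len(buf[i])
    return buf[i : i + utf8_len(buf[i])].decode("utf-8")


text = "a\u00e9\u20ac\U0001D11Ez"  # 1-, 2-, 3- and 4-byte UTF-8 sequences
print(char_at_utf32(text.encode("utf-32-le"), 3) == text[3])
print(char_at_utf8(text.encode("utf-8"), 3) == text[3])
```

  The first lookup costs the same for any n; the second grows with n,
at the price of four bytes per character in the constant-length buffer.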

>(covers the whole unicode standard), and is the closest to the concept
>of a stream of bytes terminated by a zero (C-string). The fact that a
>pure ASCII (7 bits) string in UTF-8 is exactly the same binary as
>those same bytes in ASCII is also a nice facility.
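
  The two properties quoted above can be checked with a small Python
sketch (an illustration only; the sample strings and the helper name
has_embedded_nul are made up for this example).

```python
def has_embedded_nul(data: bytes) -> bool:
    """True if the buffer contains a zero byte, which would end a C string."""
    return b"\x00" in data


sample = "Firebird"
ascii_bytes = sample.encode("ascii")
utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")

# Pure 7-bit ASCII text is byte-for-byte identical in UTF-8.
print(ascii_bytes == utf8_bytes)

# UTF-8 never produces a zero byte for a non-NUL character, so the text
# survives C-string handling; UTF-16 pads ASCII letters with zero bytes.
print(has_embedded_nul(utf8_bytes))
print(has_embedded_nul(utf16_bytes))

# The same holds for non-ASCII text in UTF-8.
print(has_embedded_nul("\u0414\u043c\u0438\u0442\u0440\u0438\u0439".encode("utf-8")))
```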

  But as you mentioned in another letter, ASCII is used by fewer than
50% of the people on Earth.
--
SY, Dimitry Sibiryakov.