Subject Re: [firebird-support] UTF8 in firebird ?
Author Lester Caine
Mark Rotteveel wrote:
>> and even you say yourself, in the true of the true standard, utf8 must
>> be encoded in up to 6 char even !:)
> That is not what I said. UTF-8 encoding was originally devised to allow
> for encoding 2^31 - 1 characters using variable length encoding of 1 to 6
> bytes (which is afaik the entire range of unicode codepoints), but because
> UTF16 only encodes 2^16-1 characters and uses surrogate pairs for higher
> order codepoints, the decision was made by the standards committee to only
> use UTF-8 encoding up to 4 bytes, so the same range of characters as UTF16
> could be encoded to make coding between UTF16 and UTF8 easier.

Just to correct this ...
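The Unicode code space is 21 bits wide (0x10FFFF is less than 2^21), not 2^31 - 1.
A code point is written with at most 6 HEX digits.
17 planes (0 to 16), from 0x000000 to 0x10FFFF, are currently defined.
So every Unicode code point fits in 3 BYTES as a raw value, although the
UTF-8 encoded form of anything above 0xFFFF still takes 4 bytes.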
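
For anyone who wants to check the sizes, here is a minimal Python sketch. The helper names (utf8_length, utf16_length) are my own and purely illustrative. It only relies on the built-in chr(), str.encode() and int.bit_length(), and it prints how many bits the highest code point needs, how many planes that range covers, and how many bytes UTF-8 and UTF-16 use for a few boundary values.

```python
# Highest code point Unicode defines (end of plane 16).
MAX_CODE_POINT = 0x10FFFF


def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses to encode a single code point."""
    return len(chr(code_point).encode("utf-8"))


def utf16_length(code_point: int) -> int:
    """Number of bytes UTF-16 uses (no BOM); 4 means a surrogate pair."""
    return len(chr(code_point).encode("utf-16-le"))


if __name__ == "__main__":
    # Bits needed to hold the raw code point value.
    print("bits needed:", MAX_CODE_POINT.bit_length())
    # Each plane holds 0x10000 code points.
    print("planes:", (MAX_CODE_POINT + 1) // 0x10000)

    # Largest code point for each UTF-8 length, 1 to 4 bytes.
    for cp in (0x7F, 0x7FF, 0xFFFF, MAX_CODE_POINT):
        print(f"U+{cp:06X}: UTF-8 {utf8_length(cp)} bytes, "
              f"UTF-16 {utf16_length(cp)} bytes")
```

The raw value fits in 3 bytes, but the encoded forms do not: UTF-8 spends some bits of every byte on framing, which is why the top of the range costs 4 bytes in both UTF-8 and UTF-16.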

Lester Caine - G8HFL