Subject Re: [firebird-support] Re: Firebird and Unicode queries
Author Lester Caine
Olivier Mascia wrote:

> What's more, UTF-32 is NOT a "four byte truncation of UTF-8".
> Absolutely NOT. Here, you're wrong Lester.

I put my hand up, and I can't find the bit I was quoting from today. My
mistake was probably misreading something last night. I was thinking of 6
bytes, which in my way of working gives 6 lots of 7 bits plus an extra-byte
flag (42/43 bits). I had not realised that Unicode is only 32 bit (I'm
sure there was a larger potential map at some stage, but I stand corrected).
So now I need to work out why 6 bytes needs to come into the equation at
all; 21 bits should still map to 4 bytes with 3 'rollover' flags ;)
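
To check the arithmetic, here is a rough sketch in Python. It assumes the
standard UTF-8 bit layout (lead byte plus 10xxxxxx continuation bytes),
not my "7 bits per byte" guess, and the function name is only for
illustration:

```python
def utf8_payload_bits(length: int) -> int:
    """Payload bits carried by a UTF-8 style sequence of `length` bytes."""
    if length == 1:
        return 7  # 0xxxxxxx: plain ASCII
    # The lead byte spends `length` one-bits plus a zero bit on the length
    # marker, leaving 7 - length payload bits; each continuation byte
    # (10xxxxxx) carries 6 payload bits.
    return (7 - length) + 6 * (length - 1)


# Lengths 5 and 6 existed only in the original, wider UTF-8 definition.
for n in range(1, 7):
    print(n, "byte(s):", utf8_payload_bits(n), "bits")

# The highest Unicode code point, U+10FFFF, needs 21 bits and 4 bytes.
print(len(chr(0x10FFFF).encode("utf-8")))
```

If that layout is right, 4 bytes carry exactly 21 bits, and 6 bytes would
only be needed for a 31-bit code space.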

> The UNICODE_FSS seem to use 3 bytes, so 24 bits. So I assume that this
> thing called 'UNICODE_FSS' is just like UTF-32 where the most
> significant byte, which is always zero, is not stored. If that is the
> case, then, YES, UNICODE_FSS can store the entire Unicode code-space
> and there is a clear bi-directional full conversion possible between
> any of these 4 representations : UTF-8, UTF-16, UTF-32, UNICODE_FSS.

'seem to use' - Having had another 'quick' look at the Unicode 4.0.0 spec,
can someone confirm that the fourth byte is zero, and not 'reserved for
future use'?
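
For what it is worth, a small Python check of the round trip and of the
top byte. This only exercises Python's own UTF-8/UTF-16/UTF-32 codecs and
says nothing about how UNICODE_FSS is stored; the helper names are made up
for the example:

```python
CODECS = ("utf-8", "utf-16-be", "utf-32-be")


def round_trips(text: str) -> bool:
    """True if text survives encode/decode in every codec listed above."""
    return all(text.encode(codec).decode(codec) == text for codec in CODECS)


def utf32_top_byte(ch: str) -> int:
    """Most significant byte of the big-endian UTF-32 form of one character."""
    return ch.encode("utf-32-be")[0]


# ASCII, Latin-1, a BMP character, and the highest code point U+10FFFF.
samples = ["A", "\u00e9", "\u20ac", "\U0010FFFF"]
for s in samples:
    print(hex(ord(s)), round_trips(s), utf32_top_byte(s))
```

Since the highest code point is U+10FFFF, the top byte comes out as zero
for every sample, including the last one.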

--
Lester Caine
-----------------------------
L.S.Caine Electronic Services