Subject Re: UTF-8 vs UTF-16
Author peter_jacobi.rm
Hi Dave, All,

In Firebird-Architect@yahoogroups.com, "David Schnepper" wrote:
[...]
> At the time (1992) I implemented Unicode-FSS (which
> was later known as FSS-UTF). There were other encoding
> proposals floating around, but I liked FSS as
> a) No embedded 0 bytes, except for real EOS.
> b) Anything that "looked like" a file system
> character ( : / . a-z, etc) really was a file system
> character.
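
Both properties are easy to check at the byte level. A
minimal Python sketch, where the sample path is just an
assumption of mine for illustration:

```python
# Hypothetical sample: ASCII path characters mixed with non-ASCII letters.
text = "C:/Daten/Größe/日本語.fdb"
data = text.encode("utf-8")

# a) No zero byte appears unless the text itself contains U+0000.
assert 0 not in data

# b) Every byte below 0x80 is a genuine ASCII character of the text,
#    because all bytes of multi-byte sequences are 0x80 or above.
ascii_from_bytes = bytes(b for b in data if b < 0x80).decode("ascii")
ascii_from_text = "".join(c for c in text if ord(c) < 0x80)
assert ascii_from_bytes == ascii_from_text
```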

Thank you for the historical background! I was just
curious whether there are any minor differences
between FSS and UTF-8, because one of the few links
I found stated that they were originally competing
proposals. It seems that its author is in error.

To the advantages of UTF-8 over some other multibyte
encodings, I would add:

1. Decoding into characters can start at the end of
the string (or even in the middle).

2. Doing an ASCII uppercase on the UTF-8 bytes
uppercases exactly the ASCII characters of the encoded
text and leaves all other characters untouched.

3. You can search for substrings at the byte stream
level instead of having to use temporary wide
character strings.
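
The three points above can be sketched in Python roughly
as follows. The helper names decode_from and last_char and
the sample string are my own assumptions for illustration,
not taken from any library, and the helpers assume valid
UTF-8 input:

```python
def decode_from(data: bytes, pos: int) -> str:
    """Decode from the first character boundary at or after pos."""
    # Continuation bytes look like 10xxxxxx; skip them to resynchronise.
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return data[pos:].decode("utf-8")


def last_char(data: bytes) -> str:
    """Decode only the final character by scanning backwards."""
    pos = len(data) - 1
    while pos > 0 and (data[pos] & 0xC0) == 0x80:
        pos -= 1
    return data[pos:].decode("utf-8")


text = "größe/日本語"
data = text.encode("utf-8")

# 1. Start in the middle (offset 3 is inside the two-byte "ö") or at the end.
tail = decode_from(data, 3)
final = last_char(data)

# 2. bytes.upper() only touches ASCII letters, so non-ASCII stays intact.
upper = data.upper().decode("utf-8")

# 3. Substring search directly on the bytes; a hit is always on a
#    character boundary, so the prefix before it decodes cleanly.
offset = data.find("ß".encode("utf-8"))
chars_before = len(data[:offset].decode("utf-8"))
```

Note that the byte offset from the search (4 here) differs
from the character index (3), so it has to be converted as
in the last line if a character position is needed.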

Regards,
Peter Jacobi


>
> I agree that no one else picked up on the name UNICODE_FSS - I made
> it up and no one else agreed with my wisdom. <grin>
>
> Dave