firebird-support - Re: [firebird-support] Unicode

Subject	Re: [firebird-support] Unicode
Author	Scott Morgan
Post date	2005-04-11T09:49:16Z

David Johnson wrote:

> Firebird's Unicode is not UTF-8. The FSS encoding is based on an early
>UTF-8 draft, intended to be friendly to UNIX file systems. But it is a
>distinctly different creature from the current incarnation of UTF-8.
>
>For example, a UTF-8 character may occupy from 1 to 6 bytes. A UNICODE
>FSS character will always occupy precisely three bytes. As far as I can
>tell, Unicode FSS is not understood by very many tools at all. UTF-8,
>on the other hand, is understood by many tools. Most current generation
>*NIX operating systems are built around UTF-8 (not ASCII).

That isn't what the source code shows, and it isn't what the FSS encoding
is. FSS is an older encoding form that eventually evolved into UTF-8; it
is a multi-byte character set (MBCS), just like UTF-8. The 3-byte figure
is how much space is allocated per character in a field, not how much
space a character always takes. A standard Latin character (that is, one
found in plain ASCII) takes only one byte, characters such as Greek
letters take 2 or more bytes, and so on. So if you declare a field of
length 10, you have 30 bytes of space to store your data, but the string
'0123456789' uses only 10 bytes.
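
To make the allocated-versus-used distinction concrete, here is a rough sketch in Python. It is not Firebird code: it uses Python's UTF-8 codec as a stand-in for UNICODE_FSS, on the assumption that the two produce the same byte lengths for the characters shown. The names utf8_len, allocated_bytes and MAX_BYTES_PER_CHAR are made up for the illustration.

```python
# Assumption: Python's UTF-8 codec stands in for UNICODE_FSS here; the
# byte counts are taken to match for the sample characters below.
MAX_BYTES_PER_CHAR = 3  # space reserved per declared character, as described above


def allocated_bytes(declared_chars: int) -> int:
    """Bytes reserved for a field declared with the given character length."""
    return declared_chars * MAX_BYTES_PER_CHAR


def utf8_len(text: str) -> int:
    """Bytes the text actually occupies once encoded."""
    return len(text.encode("utf-8"))


samples = {
    "ASCII digits": "0123456789",  # 1 byte per character
    "Greek letters": "αβγ",        # 2 bytes per character
    "Euro sign": "€",              # 3 bytes
}

print("field of length 10 reserves", allocated_bytes(10), "bytes")
for label, text in samples.items():
    print(f"{label}: {len(text)} chars, {utf8_len(text)} bytes used")
```

Under those assumptions, a field of length 10 reserves 30 bytes, while '0123456789' occupies only 10 of them.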

Scott