Subject Re: UTF-8 vs UTF-16
Author peter_jacobi.rm
Hi Nickolay,

> > 3) prohibiting invalid UTF-8 sequences in UNICODE_FSS cols
> [...] Point 3 is also possibly implemented, but not always enforced
> (at least the INTL API has enough functionality).

My copy of the "InterBase Collation Kit" doc from David
Schnepper states:

<cite>
charset_well_formed
Not used. This was intended to be a pointer to a function that would
validate that a string was well formed by the rules of a character set
</cite>

In his examples this is always NULL. Do you mean that it is in fact
called and can usefully be defined by fbintl*.dll?
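
To make the question concrete, here is a rough sketch of the kind of check such a well-formedness function could perform for UTF-8. The name utf8_well_formed and its signature are my own invention for illustration; they are not taken from the Collation Kit or from the fbintl sources. The rules follow the current Unicode definition of UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF), which may be stricter than what UNICODE_FSS historically accepted.

```c
#include <stddef.h>

/* Hypothetical validator (illustrative name and signature, not the
   InterBase/Firebird INTL API). Returns 1 if the buffer holds
   well-formed UTF-8, 0 otherwise. */
int utf8_well_formed(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;           /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or bad lead byte */
        }

        if (len - i <= n)
            return 0;                   /* sequence is truncated */

        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min)
            return 0;                   /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* UTF-16 surrogate range */
        if (cp > 0x10FFFF)
            return 0;                   /* beyond the Unicode range */

        i += n + 1;
    }
    return 1;
}
```

A check like this would have to run wherever data enters a UNICODE_FSS column for "prohibiting invalid sequences" to be enforced; whether the engine actually calls such a hook is exactly what I am asking.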

> BTW, don't you remember that Firebird already
> implements UCS2 charset under name UNICODE in standard
> fbintl.dll ? It should already have all problems
> including efficient on-page data compression solved.

Sorry, I have been looking at FB for less than four months. From
my incomplete knowledge I assumed it is not meant for
use as a database storage charset, as
- it is not made accessible through rdb$character_sets
- there is a warning somewhere about sensitivity
to endianness.
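
To illustrate why a 16-bit encoding is endianness-sensitive while UTF-8 is not, here is a small sketch. The function names are hypothetical, and I do not know how Firebird actually lays out UNICODE data on the page; the point is only that copying a 16-bit code unit in host order gives different bytes on different CPUs, whereas writing the bytes in an explicitly chosen order does not.

```c
#include <stdint.h>
#include <string.h>

/* Store one UCS-2 code unit in a fixed big-endian byte order.
   The result is the same on every platform. */
void ucs2_put_be(uint16_t unit, unsigned char out[2])
{
    out[0] = (unsigned char)(unit >> 8);
    out[1] = (unsigned char)(unit & 0xFF);
}

/* Read back a code unit stored by ucs2_put_be. */
uint16_t ucs2_get_be(const unsigned char in[2])
{
    return (uint16_t)((in[0] << 8) | in[1]);
}

/* Store the code unit in host byte order. The byte layout depends on
   the CPU, so a file written this way is not portable between
   little-endian and big-endian machines. */
void ucs2_put_native(uint16_t unit, unsigned char out[2])
{
    memcpy(out, &unit, 2);
}
```

For U+20AC, ucs2_put_be always yields the bytes 20 AC, while ucs2_put_native yields 20 AC on a big-endian host and AC 20 on a little-endian one.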

Regards,
Peter