Subject Re: Collations
Author peter_jacobi.rm
Hi Ales,

I am copying this message to Firebird-Architect, as it seems
to be more on topic there. I suggest you join Firebird-Architect
to discuss this thread further. Perhaps some of the Elders will
advise us whether we should move to the developer list, but
I would prefer Firebird-Architect.

--- In, Ales Smodis wrote:
> I should mention
> though that I'll be working under linux and will thus be able to
> generate only appropriate .so libs

Fine, I'm on Win32 and so we can test both sides of the chasm.

> Errr... Since the future seems to be unicode and Firebird does
> transliteration through unicode mappings anyway,

Yes, but not collations. The collation algorithm has to be
provided per charset. This actually makes sense, as a sort key
implementing the full UNICODE collation algorithm for the entire
UNICODE repertoire typically needs 32 bits per character, whereas
for each ISO-8859-* repertoire it should be possible to squeeze
this down to 12 bits per character. As index key length is a scarce
resource in Firebird, this helps a lot.
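
As a rough illustration of that arithmetic, here is a minimal sketch
in Python. The key limit MAX_KEY_BYTES and the helper name
max_indexed_chars are assumptions made up for this example; the real
limit depends on the Firebird version and index definition.

```python
# Rough arithmetic only. MAX_KEY_BYTES is an assumed limit chosen for
# illustration; it is not a value taken from this thread.
MAX_KEY_BYTES = 252


def max_indexed_chars(bits_per_char, max_key_bytes=MAX_KEY_BYTES):
    """Return how many characters fit into a sort key of max_key_bytes."""
    return (max_key_bytes * 8) // bits_per_char


# 32 bits per character: full UNICODE collation key.
unicode_chars = max_indexed_chars(32)
# 12 bits per character: a per-charset key for an ISO-8859-* repertoire.
latin_chars = max_indexed_chars(12)

print(unicode_chars, latin_chars)
```

Under these assumptions, the per-charset key lets an index cover
between two and three times as many characters as the full UNICODE key.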

> I went to see how
> unicode guys handle collations.
> [...]

Yes, I know that link. It gives the default UNICODE
collation. All locale-specific collations should be defined
in terms of changes to this collation.
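
To show what "defined in terms of changes" could look like, here is
a toy sketch in Python. It is not the UNICODE algorithm itself: the
default order is reduced to a..z, and the names tailor, sort_key and
SLOVENIAN_RULES are hypothetical. The only language fact assumed is
that č, š and ž are separate letters following c, s and z.

```python
import string

# Stand-in for the default collation, reduced to the letters a..z.
DEFAULT_ORDER = list(string.ascii_lowercase)


def tailor(order, rules):
    """Return a new order with each (letter, after) rule applied in turn."""
    result = list(order)
    for letter, after in rules:
        if letter in result:
            result.remove(letter)
        # Place the letter directly behind the letter it follows.
        result.insert(result.index(after) + 1, letter)
    return result


# Assumed tailoring: each rule moves one letter behind another.
SLOVENIAN_RULES = [("č", "c"), ("š", "s"), ("ž", "z")]
SLOVENIAN_ORDER = tailor(DEFAULT_ORDER, SLOVENIAN_RULES)


def sort_key(word, order):
    """Map each character to its position; unknown characters sort last."""
    rank = {ch: i for i, ch in enumerate(order)}
    return [rank.get(ch, len(order)) for ch in word.lower()]


words = ["cvet", "čas", "dan", "car"]
print(sorted(words, key=lambda w: sort_key(w, SLOVENIAN_ORDER)))
```

The point of the design is that the locale only states its three
differences; everything else is inherited from the default order.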

> Otherwise you might try

Thank you very much! I feel rather stupid that I didn't find
that link.

> > So give me a link about Croatian sort order if you find one.
> You might want to compare it with Slovenian alphabet/sort order:

And thanks for these, too.

But this is still not enough, and 'reverse-engineering' may
still be the easiest approach for getting complete
locale-specific collations. If I'm not mistaken, the pages you found
don't give the nitty-gritty details of, e.g., how to sort
the various accented characters in foreign words (Polish and
Czech differ in the relative order of á and ä, for whatever
reason).

Although I would guess that the Win32 collation support is inferior
to that of Java and GNU, it is the easiest one for me to check. I've
put the result online at

Perhaps you can check the output for the languages you know.
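
For anyone who wants to repeat the 'reverse-engineering' on another
platform, the idea can be sketched roughly as follows in Python. This
is not the program that produced my output: derive_order and
toy_compare are hypothetical names, and toy_compare is only a stand-in
for the platform's real comparison function.

```python
import unicodedata
from functools import cmp_to_key
from itertools import groupby


def derive_order(chars, compare):
    """Sort chars with compare and group those that compare as equal."""
    key = cmp_to_key(compare)
    ordered = sorted(set(chars), key=key)
    # Characters the comparison cannot tell apart end up in one group.
    return [sorted(group) for _, group in groupby(ordered, key=key)]


def toy_compare(a, b):
    """Stand-in comparison: compares base letters only, ignoring accents."""
    base_a = unicodedata.normalize("NFD", a)[0]
    base_b = unicodedata.normalize("NFD", b)[0]
    return (base_a > base_b) - (base_a < base_b)


# a, a-acute, b, a-diaeresis, c (escapes avoid encoding trouble in mail).
chars = ["a", "\u00e1", "b", "\u00e4", "c"]
print(derive_order(chars, toy_compare))
```

For a real check one would pass the platform's own comparison as
compare, for example Python's locale.strcoll after selecting the
locale in question, and then look at which characters fall into the
same group and in which order the groups appear.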

Peter Jacobi