Subject Re: [firebird-support] Collation Choice?
Author Helen Borrie
At 12:26 AM 18/08/2004 +0000, you wrote:

>Hi All,
>
>I've just finished Helen's book which is great but I'm still
>confused about how to select a Character Set and Collation. In her
>book, she lists the sets and collations but how do you tell what the
>names mean with respect to sorting?

You can't, with one exception explained in Chapter 11: the collation whose
name matches the charset is always a binary collation. For the other
collations, it's "pot-luck" as to what they do; though, in general, they are
locale-specific, dictionary-driven collations.
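
As a rough illustration of the difference, outside Firebird entirely: a binary collation orders strings by their raw character codes, so every uppercase letter sorts ahead of every lowercase one, whereas a dictionary-style ordering does not. The Python sketch below only mimics that behaviour. The sample words, and the use of `str.casefold` as a stand-in for a dictionary collation, are assumptions for illustration and say nothing about how any particular Firebird collation is implemented.

```python
# Sample data, chosen only to show the effect of case on ordering.
words = ["apple", "Zebra", "banana", "Apple"]

# Binary-style ordering: compare raw code points, so 'A'..'Z' (65-90)
# all sort before 'a'..'z' (97-122).
binary_order = sorted(words)

# Dictionary-style stand-in: compare case-folded text instead.
# Python's sort is stable, so words that fold to the same key keep input order.
dictionary_order = sorted(words, key=str.casefold)

print(binary_order)      # ['Apple', 'Zebra', 'apple', 'banana']
print(dictionary_order)  # ['apple', 'Apple', 'banana', 'Zebra']
```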

>For instance, I'd like to use a set for US English with a case
>insensitive collation but I can't figure out which one to use.
>Where can I get further information about what the various sets and
>collations offer?

AFAIK, there is no case-insensitive collation for English. If there are CI
collations available for any charsets, then it comes down to a question of
using them, testing what they can do, and writing a readme about it to
assist others. If you are a whizz with languages and can set up all of the
locales on your machine, and teach your keyboard where to find characters,
I guess you could test them all. :-)
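
The shape of such a test can be sketched without a database: feed upper- and lower-case forms of the same text to whatever comparison you are probing and see whether it calls them equal. Everything in the Python sketch below is hypothetical (the sample strings, the function names, and the two stand-in comparisons); against a real server you would replace the stand-ins with a query that compares the two values under the collation being tested.

```python
from typing import Callable

# Hypothetical probe strings; a real test would use text from the target locale.
SAMPLES = ["firebird", "Collation", "MiXeD"]

def looks_case_insensitive(equal: Callable[[str, str], bool]) -> bool:
    """Return True if `equal` treats upper- and lower-case forms as the same."""
    return all(equal(s.upper(), s.lower()) for s in SAMPLES)

# Stand-in for a binary collation: exact code-point equality.
def binary_equal(a: str, b: str) -> bool:
    return a == b

# Stand-in for a genuinely case-insensitive collation.
def folded_equal(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()

print(looks_case_insensitive(binary_equal))  # False
print(looks_case_insensitive(folded_equal))  # True
```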

Even the available documentation may not be accurate. For example, someone
did a Hungarian collation for Fb 1.0 that he submitted as
"case-insensitive", and it was documented as such in the release
notes. Eventually, when someone actually came to use it, it turned out
that it wasn't case-insensitive at all, but handled a specialised
dictionary sort order (which is what the non-binary collations generally
do). The more complex the locale rules are, the "fatter" the collation is
and the tighter the limit on the length of indexed character columns.

IB was built by Americans, for Americans; so UPPER() handles the
upper-lower mappings only for characters equivalent to those used in
American English, for any Latin charset using the default (binary)
collation. International character support was pasted on quite late in the
game.
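
To make the consequence concrete, here is a small Python sketch of a case-insensitive comparison built on an uppercase function that, by assumption, changes only the unaccented letters a-z. The names `ascii_upper` and `ci_equal_ascii` are made up for this example, and the a-z restriction is my reading of the behaviour described above, not a statement of how the engine implements UPPER.

```python
def ascii_upper(text: str) -> str:
    """Uppercase only a-z, leaving every other character untouched."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)

def ci_equal_ascii(a: str, b: str) -> bool:
    """Compare two strings after ASCII-only uppercasing."""
    return ascii_upper(a) == ascii_upper(b)

print(ascii_upper("café"))               # CAFé
print(ci_equal_ascii("Smith", "SMITH"))  # True
print(ci_equal_ascii("café", "CAFÉ"))    # False
```

Plain English text compares as expected, but anything with an accented letter falls through, which is why this approach only goes so far outside American English.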

You might like to try rolling your own. Find David Brookstone Schnepper's
link in the book to get his download kit with quite a bit of useful
documentation about how collations work. Search the archives of this list
over the past 4-5 months to find Peter Jacobi's links to his experimental
work with some charsets. You might like to monitor the current
firebird-devel threads with "INTL" in the Subject, where various people are
discussing a proposed makeover for the international language subsystem.

/heLen