Subject Re: [firebird-support] Character Sets and Collation
Author Helen Borrie
At 12:33 PM 18/10/2003 -0400, you wrote:

>I was mostly concerned with ordering of records when localized character
>sets are used.

Localized character sets have a collation sequence associated with
them. Each set has a default collation. Many sets have more than one,
since different locales can use the charset but use different characters,
different upper/lower case pairs, different dictionary orders, etc. That's
the good news.

The bad news is that the non-default collations use up a lot of bytes in
indexes. In the case of the unicode set, you can't even get any orderings
outside of the a-z, A-Z groups.

>For instance, in the following report:
>
>http://www.datatrakpos.com/WebInfo/SalesByCategory.pdf
>
>Unlike many of the reports where we try to base grouping and ordering on
>integer keys, reports like this one are grouped and ordered by the
>actual names of the products and "Categories" they belong to.....

It's a very RDBMS-ish thing to do, to design proxy sorting columns into
your base entities - even if you're using a DBMS that allows huge
keys. It's a trick that generators can be very useful for - you can
allocate a range for each alphabetic character to ensure that there will be
a "slot" available when it's needed. You index the proxy column instead of
the character column it represents. There are also sorting devices that you
can automate - such as Soundexes - to make sorts and searches independent
of character sets.

With IBO, you can even "appoint" your proxy sorting columns when setting up
datasets and the components will use them intelligently to speed up
user-driven searches and reorderings.

For joins, here I would beat the drum for using surrogate keys
always. There are 101 good reasons to do this with the dedication of
religious observance...

>BTW Helen, do you have an email notification for when your books is
>ready? I'd be very much interested in reading it.

There will be a plume of white smoke when it happens. :-)

heLen