Subject | Re: [firebird-support] Character Sets and Collation |
---|---|
Author | Helen Borrie |
Post date | 2003-10-19T00:13:43Z |
At 12:33 PM 18/10/2003 -0400, you wrote:
them. Each set has a default collation. Many sets have more than one,
since different locales can use the charset but use different characters,
different upper/lower case pairs, different dictionary orders, etc. That's
the good news.
The bad news is that the non-default collations use up a lot of bytes in
indexes. In the case of the unicode set, you can't even get any orderings
outside of the a-z, A-Z groups.
your base entities - even if you're using a DBMS that allows huge
keys. It's a trick that generators can be very useful for - you can
allocate a range for each alphabetic character to ensure that there will be
a "slot" available when it's needed. You index the proxy column instead of
the character column it represents. There are also sorting devices that you
can automate - such as Soundexes - to make sorts and searches independent
of character sets.
With IBO, you can even "appoint" your proxy sorting columns when setting up
datasets and the components will use them intelligently to speed up
user-driven searches and reorderings.
For joins, here I would beat the drum for using surrogate keys
always. There are 101 good reasons to do this with the dedication of
religious observance...
heLen
>I was mostly concerned with ordering of records when localized characterLocalized character sets have a collation sequence associated with
>sets are used.
them. Each set has a default collation. Many sets have more than one,
since different locales can use the charset but use different characters,
different upper/lower case pairs, different dictionary orders, etc. That's
the good news.
The bad news is that the non-default collations use up a lot of bytes in
indexes. In the case of the unicode set, you can't even get any orderings
outside of the a-z, A-Z groups.
>For instance, in the following report:It's a very RDBMS-ish thing to do, to design proxy sorting columns into
>
>http://www.datatrakpos.com/WebInfo/SalesByCategory.pdf
>
>Unlike many of the reports where we try to base grouping and ordering on
>integer keys, reports like this one are grouped and ordered by the
>actual names of the products and "Categories" they belong to.....
your base entities - even if you're using a DBMS that allows huge
keys. It's a trick that generators can be very useful for - you can
allocate a range for each alphabetic character to ensure that there will be
a "slot" available when it's needed. You index the proxy column instead of
the character column it represents. There are also sorting devices that you
can automate - such as Soundexes - to make sorts and searches independent
of character sets.
With IBO, you can even "appoint" your proxy sorting columns when setting up
datasets and the components will use them intelligently to speed up
user-driven searches and reorderings.
For joins, here I would beat the drum for using surrogate keys
always. There are 101 good reasons to do this with the dedication of
religious observance...
>BTW Helen, do you have an email notification for when your books isThere will be a plume of white smoke when it happens. :-)
>ready? I'd be very much interested in reading it.
heLen