Subject | Re: [firebird-support] Character Sets and Collation |
---|---|
Author | Helen Borrie |
Post date | 2003-10-18T05:54:04Z |
At 01:02 AM 18/10/2003 -0400, you wrote:
>This brings up a question of my own however. For those who distribute
>their applications and systems throughout the world, what are the common
>steps taken or is there a common methodology to localize FB databases in
>respect to character sets and collation?

Hmm, in your situation, you might need to select and output dinner menus
from the same table in a mixture of languages, right?....a fairly
complicated problem. Not to mention your overseas sites, of course.

The case of "multi-lingual menus" is fairly simply solved, by having 1:1
relations from your main-language "menu" table to one table for each
individual character set that doesn't work with the main charset of the
database...or a 1:many to a single table containing a column for each
charset. The problem with the latter is that you have to change metadata
to add a new language, whereas with the former you deploy an optional extra
table. Your application code can take care of "stubbing" tables that may
be absent, so an extra language needn't break your code, as the "change
metadata" approach would.
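
As a rough illustration of the table-per-language layout and of "stubbing" a
table that may be absent, here is a minimal Python sketch. It uses the
standard-library sqlite3 module purely as a stand-in so that it runs anywhere;
SQLite has no per-table character sets, and against Firebird the existence
check would look the table up in RDB$RELATIONS instead. The table names
(MENU, MENU_FR), the columns and the helper functions are all hypothetical.

```python
import sqlite3


def table_exists(con, name):
    # SQLite stand-in: Firebird would query RDB$RELATIONS for the table name.
    row = con.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def menu_item(con, item_id, lang):
    """Return the item text in `lang`, falling back to the main-language table."""
    table = "MENU_" + lang.upper()
    # The table name is only interpolated after it is known to exist.
    if lang.isalpha() and table_exists(con, table):
        row = con.execute(
            f"SELECT ITEM_TEXT FROM {table} WHERE ITEM_ID = ?", (item_id,)
        ).fetchone()
        if row is not None:
            return row[0]
    # "Stub" path: language table absent, or no translation for this item.
    row = con.execute(
        "SELECT ITEM_TEXT FROM MENU WHERE ITEM_ID = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MENU (ITEM_ID INTEGER PRIMARY KEY, ITEM_TEXT TEXT NOT NULL)")
# Optional extra table, 1:1 with MENU; a site without French simply omits it.
con.execute(
    "CREATE TABLE MENU_FR ("
    "ITEM_ID INTEGER PRIMARY KEY REFERENCES MENU (ITEM_ID), "
    "ITEM_TEXT TEXT NOT NULL)"
)
con.executemany(
    "INSERT INTO MENU VALUES (?, ?)", [(1, "Onion soup"), (2, "Cheese plate")]
)
con.execute("INSERT INTO MENU_FR VALUES (?, ?)", (1, "Soupe à l'oignon"))

print(menu_item(con, 1, "fr"))  # translated row present
print(menu_item(con, 2, "fr"))  # table present, row missing
print(menu_item(con, 1, "ru"))  # no MENU_RU table deployed at all
```

Adding Russian later would mean deploying a MENU_RU table; nothing in the
existing tables or in menu_item() has to change.
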

One way to go would be to use Unicode as the database default. It's the
nearest you can get to the "one size fits all" solution but currently it's
imperfect. The present UNICODE_FSS charset doesn't support
case-insensitive searching or localized sort orders and it eats a lot of
bytes in indexes.
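
To put a number on "eats a lot of bytes", here is a small back-of-envelope
sketch in Python. It assumes that UNICODE_FSS reserves up to 3 bytes per
declared character and that the index key limit is 252 bytes; treat the second
figure in particular as an assumption to check against your server version.
UTF-8 is used as an approximation of the stored encoding, and the function
names are made up for the example.

```python
FSS_BYTES_PER_CHAR = 3         # assumption: worst case reserved per character
ASSUMED_INDEX_KEY_LIMIT = 252  # assumption: index key limit in bytes


def reserved_bytes(declared_chars, bytes_per_char):
    """Worst-case storage reserved for a (VAR)CHAR(declared_chars) column."""
    return declared_chars * bytes_per_char


def max_indexable_chars(bytes_per_char, key_limit=ASSUMED_INDEX_KEY_LIMIT):
    """Longest declared column length whose worst case still fits the key."""
    return key_limit // bytes_per_char


def encoded_len(text):
    # UTF-8 approximates UNICODE_FSS for characters that need 1 to 3 bytes.
    return len(text.encode("utf-8"))


print(reserved_bytes(100, 1), reserved_bytes(100, FSS_BYTES_PER_CHAR))
print(max_indexable_chars(1), max_indexable_chars(FSS_BYTES_PER_CHAR))
for word in ("menu", "Меню", "メニュー"):
    print(word, len(word), "chars,", encoded_len(word), "bytes")
```

Under these assumptions an indexed column can be declared only about a third
as long as the same column in a single-byte charset.
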

Another way to go might be to maintain separate versions of the metadata,
with some intelligent grouping of charsets-per-locale. Actually, one could
write a whole book on the subject of database localization, as I found
recently while writing Chapter 11 of the forthcoming book. :-(
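
One possible reading of "separate versions of the metadata" is to keep a
single DDL template and generate one script per charset group. The Python
sketch below only builds the script text. The locale-to-charset grouping, the
file names and the MENU table are illustrative assumptions, not a
recommendation for any particular set of languages.

```python
# Hypothetical grouping: one metadata version per charset, shared by the
# locales that charset can represent.
CHARSET_GROUPS = {
    "WIN1252": ["en", "fr", "de", "es"],
    "WIN1250": ["pl", "cs", "hu"],
    "WIN1251": ["ru", "bg"],
}

DDL_TEMPLATE = """CREATE DATABASE '{db_file}' DEFAULT CHARACTER SET {charset};
CREATE TABLE MENU (
  ITEM_ID INTEGER NOT NULL PRIMARY KEY,
  ITEM_TEXT VARCHAR(80) NOT NULL
);"""


def charset_for(locale):
    """Find the charset group that a locale code belongs to."""
    for charset, locales in CHARSET_GROUPS.items():
        if locale in locales:
            return charset
    raise KeyError(f"no charset group defined for locale {locale!r}")


def build_script(locale):
    """Render the metadata script for the group containing `locale`."""
    charset = charset_for(locale)
    return DDL_TEMPLATE.format(
        db_file=f"menu_{charset.lower()}.fdb", charset=charset
    )


print(build_script("ru"))
```

Locales in the same group share one script, so the number of metadata
versions to maintain follows the number of charsets rather than the number of
languages.
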

At the very least, you are looking at a fairly significant design project
to provide a generic, worldwide-capable database system, of which character
storage in the database is only one aspect.

heLen