Subject Re: [firebird-support] Re: Problem with macrons (unicode)...
Author David Johnson
On Thu, 2005-08-04 at 11:50 +0000, phil_hhn wrote:
> > If you choose this approach, you will have to do your localized
> > collation in the Java engine, since character set NONE allows only
> > binary sort order. Binary order is great for exact searches, but
> > limits the use of some of the other features of the engine like
> > dictionary order collations.
> Great, I didn't realise you could set the character set on the
> column.... I'll look into it. But your mention of binary sort order
> concerns me since we use a lot of text searches, wildcards and
> sorting, so that'll have to work still.
>
Wildcards should work the same.

Case and accent sensitivity are localization issues. Character \u00c5
(Å) is an A with a ring on top to an English speaker, but it is a
separate letter of the alphabet in Norwegian and Swedish (a vowel in
both).

To an English speaker, a search for A% should find Ångström. But to a
Norwegian or Swede it should not, because to them the character is not
just an accented A.
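
As a rough illustration of that difference, here is a minimal sketch of
an "A%"-style prefix test done in Java rather than in the database. The
class name PrefixMatchSketch and the helper startsWith are made up for
this example, and it assumes that comparing only the first
prefix-length characters (after NFC normalization) is good enough. It
uses java.text.Collator at PRIMARY strength, which ignores case and
accents in locales that treat them as variants of the base letter.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

class PrefixMatchSketch {
    // Hypothetical helper: emulates LIKE 'prefix%' under the collation
    // rules of the given locale instead of a binary comparison.
    static boolean startsWith(String candidate, String prefix, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY strength: only base-letter differences count.
        collator.setStrength(Collator.PRIMARY);
        // Normalize so a precomposed character occupies one char.
        String c = Normalizer.normalize(candidate, Normalizer.Form.NFC);
        String p = Normalizer.normalize(prefix, Normalizer.Form.NFC);
        if (c.length() < p.length()) {
            return false;
        }
        return collator.compare(c.substring(0, p.length()), p) == 0;
    }

    public static void main(String[] args) {
        // Escapes are used so the source file encoding does not matter.
        String word = "\u00C5ngstr\u00F6m";
        Locale swedish = Locale.forLanguageTag("sv-SE");
        System.out.println("English: " + startsWith(word, "A", Locale.ENGLISH));
        System.out.println("Swedish: " + startsWith(word, "A", swedish));
    }
}
```

With the English collator the word matches the prefix A. With a Swedish
collator I would expect no match, provided the runtime ships the Swedish
collation rules; if it does not, Java falls back to default rules and
the two results will be the same.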

Java's international collation support is superior to Firebird's (or
that of most commercial DBMSs, for that matter). You may need to adapt
your product's strategy to pull back initial lists from the database,
use Java's Locale and Collator classes to sort them for display, and
possibly add an initial and/or final filtering stage in Java.
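
A minimal sketch of the sort-for-display step, assuming the rows have
already been fetched from the database into a List<String>. The class
name LocaleSortSketch and the method sortForDisplay are invented for
this example; the only library pieces used are java.text.Collator and
java.util.Locale.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class LocaleSortSketch {
    // Hypothetical helper: returns a copy of the rows ordered by the
    // collation rules of the user's locale. The input list is untouched.
    static List<String> sortForDisplay(List<String> rows, Locale locale) {
        // Collator implements Comparator, so it can be passed to sort().
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(rows);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for a result set; escapes avoid source encoding issues.
        List<String> rows = Arrays.asList("Zorn", "\u00C5berg", "Adams");
        System.out.println(sortForDisplay(rows, Locale.ENGLISH));
        System.out.println(sortForDisplay(rows, Locale.forLanguageTag("sv-SE")));
    }
}
```

Under English rules the name starting with Å sorts among the A names.
Under Swedish rules I would expect it to sort after Z, again assuming
the runtime includes the Swedish collation data. Creating one Collator
per request or per user locale keeps the database itself free of any
locale-specific ordering.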

I was working on a product whose major clientele were in countries using
Arabic, Chinese, and Cyrillic text. I quickly came to the conclusion
that supporting all of those languages, plus Latin alphabet languages,
concurrently in the same database instance, required Java's
capabilities. No DBMS (except possibly one written from the ground up
in Java) would be capable of handling those concurrently.