Subject | Re: confused with charset and collation |
---|---|
Author | peter_jacobi.rm |
Post date | 2004-06-07T07:08:56Z |
Hi Didier!
"Didier Gasser-Morlay" <Didiergm@n...> wrote:
> B) see my questions inline
I'll try to clarify.
> > In addition I can recommended reading the Unicode and
> > Dave's documentation about multi level collation.
> >
> Where can I find it ? I only find a direct link to ibcollate.
Dave's doc is included in the Collation "SDK":
http://www.brookstonesystems.com/CollateKit.zip
> > and rewrite the query to "BETWEEN "cafe" AND "cafezzz".
> [didier] with that multi-level collation it looks like the query must
> be run in lower case isn't it ?
You must get the lowest valued variation as lower bound.
This is almost always the lower case version. You can also
'decrement' the base string and append 'zzz':
BETWEEN 'Rhôndzzz' AND 'Rhônezzz' will match everything that
starts with 'Rhône' in all casing and accenting variations.
(This assumes that no real string in your DB contains 'zzz';
otherwise you would get some additional false positives.)
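To make the 'decrement and append' step concrete, here is a minimal Python sketch of how the two bounds could be built. The helper name `prefix_range` is hypothetical and not part of any Firebird API. It also assumes the prefix ends in a plain ASCII letter other than 'a', because stepping back one code point only matches the collation's previous base letter in that simple case.

```python
def prefix_range(prefix, pad="zzz"):
    """Return (lower, upper) bounds for a BETWEEN prefix search.

    Hypothetical helper: assumes the last character of `prefix` is a
    plain ASCII letter other than 'a'/'A', so that the previous code
    point is also the previous base letter in the collation.
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    last = prefix[-1]
    if not (last.isascii() and last.isalpha() and last.lower() != "a"):
        raise ValueError("last character must be an ASCII letter b-z")
    # 'Decrement' the last base letter, then pad both bounds.
    lower = prefix[:-1] + chr(ord(last) - 1) + pad
    upper = prefix + pad
    return lower, upper


bounds = prefix_range("Rhône")  # ('Rhôndzzz', 'Rhônezzz')
```

The two values would then be bound as parameters of the BETWEEN query rather than pasted into the SQL text.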
> > Both options were discussed in earlier threads.
> [Didier] What do you call a multi level collation, I could not find
> any ref when searching the group.
In addition to Dave's doc, look here:
http://www.unicode.org/reports/tr10/
It's the process of splitting the string comparison
into multiple phases, so that any difference in 'base character'
is considered a stronger difference than any difference
in accents, which itself is stronger than any difference
in casing. Equivalently, the string to be sorted is decomposed
into these components (base letters, accent weights, case weights):
Rhône => RHONE-00300-10000
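The decomposition can be sketched in Python as a three-part sort key. This is only an illustration of the multi-level idea, not the Unicode Collation Algorithm and not what any Firebird collation driver actually does. The function name `multilevel_key` is hypothetical, and the accent and case levels use simple 0/1 flags instead of real per-accent weights such as the 3 in the example above.

```python
import unicodedata


def multilevel_key(s):
    """Return a (primary, secondary, tertiary) sort key for s.

    primary:   base letters, accents stripped, upper-cased
    secondary: one flag per character, 1 if it carried an accent
    tertiary:  one flag per character, 1 if it was upper case
    """
    primary, secondary, tertiary = [], [], []
    # Compose first so each accented letter is a single character.
    for ch in unicodedata.normalize("NFC", s):
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        marks = [c for c in decomposed[1:] if unicodedata.combining(c)]
        primary.append(base.upper())
        secondary.append(1 if marks else 0)
        tertiary.append(1 if base.isupper() else 0)
    return ("".join(primary), tuple(secondary), tuple(tertiary))


words = ["rhone", "Rhône", "Rhone", "rhône", "Rhond"]
ordered = sorted(words, key=multilevel_key)
# ordered == ['Rhond', 'rhone', 'Rhone', 'rhône', 'Rhône']
```

Because tuples compare element by element, a base-letter difference always wins over an accent difference, which in turn wins over a case difference. With these flags the lower case, unaccented spelling sorts first among the variations.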
> Re the non-standard nocase noaccent, I suppose you make a ref to
> dave's work at brookstonesystems. I seems that it does not work in fb
> 1.5 nor on Linux. Both are showstoppers to me as even the construction
> kit says it does not work with 1.5.
It is told at the campfires, that Dave's collation can be
made to work with FB1.5 (on Win32):
- copy (not rename) his gdsintl2.dll to fbintl2.dll
- copy (not rename) FB's fbintl.dll to gdsintl.dll
For Linux you must ask Dave himself.
At least one brave user tried the nocase-noaccent collation
'LOADABLE' from my demo kit:
http://www.jodelpeter.de/i18n/fbarch/
With minor tweaks it should compile under Linux.
> [didier] with my, hopefully yet, limited understanding I'd say that
> ISO-8859-1 is enough
This has the widest range of culturally correct collations.
Just select the one matching your largest userbase or
customize on install.
Regards,
Peter Jacobi