Subject | Re: [firebird-support] Case and Accent insensitive compares |
---|---|
Author | liviuslivius |
Post date | 2016-06-16T08:00:42Z |
Hi,
you are using the wrong collation.
UNICODE_CI is truly case-insensitive. In a search for e.g. 'Apple', it will also find 'apple', 'APPLE' and 'aPPLe'.
UNICODE_CI_AI is accent-insensitive as well. According to this collation, 'APPEL' equals 'Appèl'.
As you can see, it is UNICODE_CI_AI that is accent-insensitive. Use UNICODE_CI instead.
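A minimal sketch of the same kind of test with UNICODE_CI in place of UNICODE_CI_AI, assuming the connection character set is UTF8 as in the original queries. If UNICODE_CI behaves as described above, the accented pairs should come back as 'not equal' and the pure case difference as 'equal'. The table and column names in the last statement are hypothetical and only show where the collation would be declared.

```sql
-- Accent differs: UNICODE_CI is expected to keep these apart.
select case when 'a' = 'ä' collate unicode_ci then 'equal' else 'not equal' end from rdb$database
union all
select case when 'O' = 'Ö' collate unicode_ci then 'equal' else 'not equal' end from rdb$database
union all
-- Only case differs: UNICODE_CI is expected to treat these as the same.
select case when 'Ä' = 'ä' collate unicode_ci then 'equal' else 'not equal' end from rdb$database;

-- Hypothetical table: declare the collation on the column so that every
-- comparison on it is case-insensitive but accent-sensitive by default.
create table person (
  name varchar(50) character set utf8 collate unicode_ci
);
```

Note that with UNICODE_CI a search for 'Gerard' would no longer be expected to match 'Gérard', because all accents become significant, not just the German umlauts.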
regards,
Karol Bieniaszewski
On 2016-06-15 17:28:34, Stefan Heymann lists@... [firebird-support] <firebird-support@yahoogroups.com> wrote:
> I expect that an accent insensitive compare treats accented characters
> as the "same" as their un-accented counterparts because the accent
> does not change the character itself but things like pronunciation or
> stress.
>
> So in French, à is similar to a, é is similar to è and you use an
> accent insensitive compare to find Gérard even though your search term
> says Gerard (without the accent).
>
> However, in the German language, the letters Ö and O are two different
> characters with a completely different pronunciation (the same is
> true for A/Ä and U/Ü). As they look similar, the sorting is done so
> that they stay together, but they can _not_ be treated as accented
> versions of each other.
>
> When I use the UNICODE_CI_AI collation to compare them, Firebird
> treats them as the same:
>
> select case when 'a' = 'ä' collate unicode_ci_ai then 'equal' else 'not equal' end || ' expected: not equal' from rdb$database
> union all
> select case when 'O' = 'Ö' collate unicode_ci_ai then 'equal' else 'not equal' end || ' expected: not equal' from rdb$database
> union all
> select case when 'Ä' = 'ä' collate unicode_ci then 'equal' else 'not equal' end || ' expected: equal' from rdb$database
> union all
> select case when 'a' = 'à' collate unicode_ci_ai then 'equal' else 'not equal' end || ' expected: equal' from rdb$database
> union all
> select case when 'c' = 'ç' collate unicode_ci_ai then 'equal' else 'not equal' end || ' expected: equal' from rdb$database
> union all
> select case when 'é' = 'è' collate unicode_ci_ai then 'equal' else 'not equal' end || ' expected: equal' from rdb$database
>
> delivers:
>
> equal expected: not equal
> equal expected: not equal
> equal expected: equal
> equal expected: equal
> equal expected: equal
> equal expected: equal
>
>
> Is there something that can be done to improve this?
>
>
> Regards
>
> Stefan
>
> --
> Stefan Heymann, Tübingen, Germany
>
>