Subject Re: [Firebird-Architect] Re: UTF-8 (various)
Author Ivan Prenosil
"Jim Starkey" wrote:
> Good. Let's accept the idea that there are different flavors of case
> insensitivity. I think we can also agree that a collation is required
> to define case insensitivity for a particular language. The question is
> whether case insensitivity is, in fact, just a collation, or something a
> little stronger. The difference, I think, is in the definition of the
> equality operator. For field declared as case insensitive, two values
> can be different but still equal. Furthermore, a comparison of field
> that is case insensitive to a case sensitive value must be performed
> with case insensitivity. Should the same rules apply to simple
> collations? It seems funny that a client can request a case insensitive
> order of values but not a case insensitive retrieval by value.

Sort order, equality, uppercasing rules, case insensitivity,
accent (or more generally diacritic mark) insensitivity,
handling of double letters (ss, ae, ch) and special characters
(space,-,=,@,#,...) - all these things are properties of the
_operation_, but _fields_ should be able to specify a suitable
default (as they do now by means of a collation).

And it makes sense to mix different rules for a single field, e.g.
I may want to search case insensitively (because I do not know
the exact case the data are stored in) and accent insensitively
(because the user is too lazy to use more than 26 keys),
but still want the result sorted in correct dictionary order.
At the same time, I need a unique constraint on that field
that is case insensitive but accent sensitive.
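
To make the point concrete, here is a minimal sketch in Python (not
Firebird code) of one value getting a different key for each
operation. The function names are hypothetical, and the sort key is
only a crude stand-in for real dictionary order; a proper collation
would also handle digraphs such as "ch".

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_key(text: str) -> str:
    """Key for searching: case insensitive and accent insensitive."""
    return strip_accents(text).casefold()


def unique_key(text: str) -> str:
    """Key for a unique constraint: case insensitive, accent sensitive."""
    return unicodedata.normalize("NFC", text).casefold()


def sort_key(text: str) -> tuple:
    """Crude approximation of dictionary order (an assumption, not a
    real collation): base letters first, then accents, then case."""
    return (search_key(text), unique_key(text), text)


# Searching treats all three spellings as the same value ...
assert search_key("Král") == search_key("kral") == search_key("KRAL")

# ... while the unique constraint still tells "kral" and "král" apart.
assert unique_key("Král") == unique_key("KRÁL")
assert unique_key("Kral") != unique_key("Král")

# Sorting uses a third rule, independent of the other two.
ordered = sorted(["Zeman", "Král", "kral"], key=sort_key)
```

All three keys are derived from the same stored value, so which rule
applies is decided by the operation, not by the data.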

The current syntax allows specifying a collation for sorting:

... ORDER BY MyField COLLATE win1250;

but for equality, for example, the collation is specified on the
operands, not on the operation:

select * from mytab
where myfield collate pxw_csy = 'K' collate win1250;

In this case only one collation will be used, of course,
so perhaps the syntax should rather be:

select * from mytab
where myfield = collate pxw_csy 'K';
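
The same idea as a minimal Python sketch (again not Firebird code):
the comparison rule is an argument of the equality operation, applied
to both operands, and the field only supplies a default. The names
Field, equals, binary and case_insensitive are hypothetical.

```python
from typing import Callable, Optional

# A rule maps a string to the key that is actually compared.
Rule = Callable[[str], str]


def binary(text: str) -> str:
    """Exact comparison: the key is the text itself."""
    return text


def case_insensitive(text: str) -> str:
    """Case-insensitive comparison via Unicode case folding."""
    return text.casefold()


class Field:
    """A stored value together with the default rule declared for it."""

    def __init__(self, value: str, default_rule: Rule = binary) -> None:
        self.value = value
        self.default_rule = default_rule


def equals(field: Field, literal: str, rule: Optional[Rule] = None) -> bool:
    """Compare with one rule for both operands; if the operation does
    not name a rule, fall back to the field's default."""
    chosen = rule if rule is not None else field.default_rule
    return chosen(field.value) == chosen(literal)


plain = Field("k")                      # default: exact comparison
folded = Field("k", case_insensitive)   # default: case insensitive

assert not equals(plain, "K")                     # field default applies
assert equals(plain, "K", rule=case_insensitive)  # operation overrides it
assert equals(folded, "K")                        # field default applies
```

Because the rule is chosen once per operation, the question of which
operand's collation wins never arises.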



Ivan