Subject Re: [firebird-support] Problem with UPPER() when using Greek Charsets
Author Helen Borrie
At 07:42 AM 11/08/2004 +0000, you wrote:
>Hi,
>When using the upper() function in a varchar column containing Greek
>data (i.e. ISO-8859-7, UNICODE, WIN1253), we see no effect in the
>original string.
>Example
>select upper('My LowerCase string in Greek') from test
>it returns always the original string.
>
>Is there any way to overcome this problem? We think that Firebird is
>superb, but this (if we cannot resolve it) is a major weakness (at
>least for Greek Customers)...

Our "Greek Guru" has gone to Athens for the Games so it's going to be some
time before we hear from him. :-)

I'll try to start it off but, not knowing myself what any Greek capital
letter should look like, can't test for you...

First, UNICODE_FSS is a very blunt instrument. It can store upper-case
Greek characters, but it has no knowledge of the mappings between a
lower-case character and its upper-case equivalent, for any language-symbol
set.

ISO8859-7, with the default (and only available) COLLATE sequence
ISO8859-7, and WIN1253, with the default COLLATE sequence WIN1253, are
binary collations. This is governed by a rule that all the default
collations (the ones with names matching the character set names) are
binary: they have all the characters, but they don't know the upper/lower
mappings.

In the case of ISO8859-7 (and some of the other ISO sets) I'm not
absolutely certain that the binary rule actually applies to the default
COLLATE sequence. It would be worth trying, at least, to test whether
UPPER gives you uppercasing for that set.

The thing to try would be WIN1253 with the COLLATE sequence PXW_GREEK. If
the column itself has not been defined with COLLATE PXW_GREEK then try to
apply the collation to the output, as follows:

select upper('My LowerCase string in Greek') COLLATE PXW_GREEK from test
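
If putting COLLATE on the output makes no difference, another thing worth trying is putting it on the argument instead, so that UPPER() sees the collation before it does its work. This is an untested sketch, not a confirmed fix: GREEK_COL is a made-up column name standing in for your own WIN1253 column, and the literal is only a placeholder. The _WIN1253 introducer in the second statement assumes that the literal really is encoded in WIN1253 when it reaches the server.

```sql
-- Sketch only: GREEK_COL is a hypothetical WIN1253 column in table TEST.
-- The collation is attached to the argument of UPPER(), not to its result.
select upper(GREEK_COL COLLATE PXW_GREEK) from test;

-- Same idea with a literal. The _WIN1253 introducer declares the character
-- set of the literal, so that PXW_GREEK can legally be applied to it.
select upper(_WIN1253 'My LowerCase string in Greek' COLLATE PXW_GREEK) from test;
```

If either form returns upper-cased Greek, that tells us the mapping lives in the collation and not in the character set's default.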

If that works, then you might like to consider rebuilding the table
definitions of tables that you know you need uppercasing for. It's best to
avoid imposing non-binary collations on columns where they are not needed,
because they further restrict the maximum width of an index on those columns.
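
As a rough illustration of what such a rebuilt definition could look like (assuming the PXW_GREEK test above turns out to work), the collation can be declared on the column itself. The table and column names here are invented for the example and the width is arbitrary:

```sql
-- Sketch only: CUSTOMER_GR, ID and SURNAME are hypothetical names.
-- The collation is declared once, on the column that needs uppercasing.
create table CUSTOMER_GR (
  ID      integer not null primary key,
  SURNAME varchar(40) character set WIN1253 collate PXW_GREEK
);

-- With the collation on the column, no COLLATE clause is needed in the query.
select upper(SURNAME) from CUSTOMER_GR;
```

Columns that never need uppercasing or language-aware sorting can stay on the default collation.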

Let us know what you find, because this is a poorly documented area.

/heLen