Subject | [IBO] Re: Fields with collation=PXW_SLOV consume 3 bytes per char? |
---|---|
Author | aleksander_oven |
Post date | 2003-10-14T13:43:40Z |
Helen,
> Someone can, but it's absolutely off-topic here. Just this once...
I'm really sorry. It won't happen again. Thanks! :)
> You guessed right. The non-binary collations eat a lot of bytes
> <snip>
Hmmm... This is as I thought then. But what really puzzles me is the
fact that, assuming PXW_SLOV means Slovenian, this collation *should*
be binary. There are no >255 characters in our character set. Our
alphabet only uses about 5 non-English characters (times 2 for
upper/lower), which I thought were all part of the ANSI character set.
Their byte numbers are as follows:
Latin Letter S With Caron (upper=138, lower=154)
Latin Letter Z With Caron (upper=142, lower=158)
Latin Letter C With Acute (upper=198, lower=230)
Latin Letter C With Caron (upper=200, lower=232)
Latin Letter D With Stroke (upper=208, lower=240)
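A quick way to double-check the byte numbers listed above is to encode each letter with a single-byte codec. The sketch below is only an illustration: it uses Python's `cp1250` codec and assumes that this corresponds to Firebird's WIN1250 charset. The helper names `byte_values` and `check_table` are made up for this example. It says nothing about how many bytes a collation such as PXW_SLOV reserves for its sort keys; it only shows that each letter is stored as one byte in the character set itself.

```python
# Letters from the list above: (upper, lower, upper byte, lower byte) as posted.
LETTERS = {
    "S with caron": ("Š", "š", 138, 154),
    "Z with caron": ("Ž", "ž", 142, 158),
    "C with acute": ("Ć", "ć", 198, 230),
    "C with caron": ("Č", "č", 200, 232),
    "D with stroke": ("Đ", "đ", 208, 240),
}


def byte_values(char, codec="cp1250"):
    """Return the list of byte values that `char` encodes to in `codec`."""
    return list(char.encode(codec))


def check_table(codec="cp1250"):
    """Return the names of letters whose encoding differs from the posted values."""
    mismatches = []
    for name, (upper, lower, up_byte, low_byte) in LETTERS.items():
        if byte_values(upper, codec) != [up_byte] or byte_values(lower, codec) != [low_byte]:
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    for name, (upper, lower, _, _) in LETTERS.items():
        print(name, byte_values(upper), byte_values(lower))
    print("cp1250 mismatches:", check_table("cp1250"))
    print("iso8859_2 mismatches:", check_table("iso8859_2"))
```

With `cp1250` every letter comes out as exactly one byte with the value listed above. With `iso8859_2` the S and Z with caron sit at different positions, so the numbers in the list are specific to the Windows code page.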
On the operating system level we generally use either the Windows Central
European or ISO-8859-2 charsets, which both contain all these characters.
With Firebird I'm using the WIN1250 charset, which is essentially Windows
Central European, IIRC.
So the problem is, I don't understand why any collation of such
character sets has to be multibyte in the first place.
Kind regards,
Aleksander Oven