Subject | RE: [firebird-support] Unicode size |
---|---|
Author | David Johnson |
Post date | 2004-12-15T23:19:57Z |
Uppercasing and collation order are locale dependent, even for
relatively simple non-ASCII characters.
For example, the A with a ring on top (Å, also used as the Angstrom
symbol) is treated identically to an A in English, but is its own
letter in Norwegian and Swedish. Even those two place it differently:
in Swedish it comes directly after Z, while in Norwegian it is the last
letter of the alphabet, after Æ and Ø.
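
The difference between treating Å as a plain A and treating it as a
separate letter after Z can be sketched in Python. This is only a
minimal illustration, not a real collation implementation: the names
english_key, swedish_key and SWEDISH_ALPHABET are made up for this
example, and production code would more likely rely on a dedicated
collation library.

```python
import unicodedata

# Assumption for illustration: a bare Swedish alphabet, ignoring the
# many finer rules a real collation table would carry.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def english_key(word):
    """Sort key that drops diacritics, so 'å' is compared as 'a'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def swedish_key(word):
    """Sort key that ranks each letter by its position in the Swedish alphabet."""
    rank = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}
    composed = unicodedata.normalize("NFC", word.lower())
    # Letters outside the alphabet sort after all known letters.
    return [rank.get(ch, len(SWEDISH_ALPHABET)) for ch in composed]


words = ["zebra", "åre", "bok"]
english_order = sorted(words, key=english_key)
swedish_order = sorted(words, key=swedish_key)

print(english_order)  # ['åre', 'bok', 'zebra']
print(swedish_order)  # ['bok', 'zebra', 'åre']
```

The same three words come out in a different order depending on which
language's rules are applied, which is why the sort key has to be
chosen per locale rather than per character set.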
In German, the letter ß (Eszett, which resembles the Greek letter beta
but is unrelated to it) is the standard representation of a sharp S
sound. It is identical in collation order and semantics to "ss". It is
written only in lower case, and replaced by SS when capitalized.
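
The capitalization rule can be checked quickly in Python, assuming
Python 3 and its built-in Unicode case mappings. The word used here is
just an illustrative choice.

```python
word = "straße"

upper = word.upper()        # the single letter ß becomes the two letters SS
round_trip = upper.lower()  # lowercasing does not restore the original ß
folded = word.casefold()    # case folding also maps ß to "ss"

print(upper, len(word), len(upper))  # STRASSE 6 7
print(round_trip == word)            # False
print(folded == "strasse")           # True
```

Note that uppercasing changes the length of the string and is not
reversible, so a field sized for the lowercase form may be too small
for the uppercase one.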
Don't even start looking at Tibetan! Unlike Chinese, it is not
ideographic, but its script stacks letters on top of each other under
several circumstances. It is no wonder that so little software is
written to support the Tibetan language.
Short form: you cannot depend on the OS to know the capitalization and
collation rules for your language.