Subject | character sets problems and i18n |
---|---|
Author | Daniel Albuschat |
Post date | 2004-06-25T11:14:01Z |
Hello,
I'm quite upset about the handling of character sets in our current
database. We're developing German software (we will add i18n
support quite soon) and had some problems with default character sets in
the early days, so we set the character set for every text field
explicitly. I think this was a mistake: if you forget to set
the character set, the application (and to some extent the database, as
described below) will break when you input umlauts or other special
characters. That's quite tedious, since it happens from time to time and
can break big parts of the application.
And I think it'll be a nightmare once we use the software with different
languages and therefore different character sets.
Anyway, the actual problem I have at the moment is quite simple to
describe:
There's a table `A' with a field `text' of type `varchar(40)'.
Someone put an umlaut in this field, and now you get the
error message `Cannot transliterate character between character sets'
every time you access the table in some way. IIRC (I'm not very sure),
it used to work to create a temporary field of type
`varchar(40) character set iso8859_1', copy the contents of `text'
to this temporary field with `update A set tmp_text=text;', drop `text'
and recreate it with the right character set, and copy `tmp_text' back
to `text'. This time, though, I get the `Cannot transliterate [...]'
error message already on the first update. I haven't found a way
to preserve the data and change the field's character set yet.
Is there any way to fix this error?
And why does Firebird allow this value to be inserted in the first place?
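To make the failure mode concrete, here is a small Python sketch. It is not
Firebird code and only an analogy: it assumes the stored value is the
ISO8859_1 byte sequence of a word with an umlaut, and the helper names
`transliterate' and `try_transliterate' are made up for illustration. It shows
that converting such bytes fails when the source character set is taken to be
plain ASCII, and succeeds once the source character set is named correctly.

```python
# Illustration only, not Firebird code. Assumption: the stored value is the
# ISO8859_1 encoding of a word containing an umlaut ("Mueller" with u-umlaut).
stored = "M\u00fcller".encode("iso8859_1")  # one byte (0xFC) for the umlaut


def transliterate(raw: bytes, source: str, target: str) -> bytes:
    """Decode raw bytes as `source`, then re-encode the text as `target`."""
    return raw.decode(source).encode(target)


def try_transliterate(raw: bytes, source: str, target: str):
    """Return the converted bytes, or a short message if conversion fails."""
    try:
        return transliterate(raw, source, target)
    except UnicodeError as exc:
        return f"cannot transliterate: {exc.__class__.__name__}"


# Wrong assumption about the source character set: byte 0xFC is not ASCII.
print(try_transliterate(stored, "ascii", "iso8859_1"))

# Correct source character set: the umlaut becomes two bytes in UTF-8.
print(try_transliterate(stored, "iso8859_1", "utf-8"))
```

Whether the database applies the same kind of conversion on the
`update A set tmp_text=text;' step is a guess on my part.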
Furthermore, I'd like to know your advice on how to handle the
internationalisation of the database. I think UTF-8 is the
only way to go. There has been discussion about always using
UTF-16 in Firebird, IIRC. Does anyone know whether the idea has been
accepted or declined?
Thanks for your time,
Daniel Albuschat
--
eat(this); // delicious suicide