Subject | Re: [firebird-support] Changes in CharSet during Migration from FB 1.5.6 to 2.5.x |
---|---|
Author | Michael Ludwig |
Post date | 2011-03-24T22:05:43Z |
Serdar Gül wrote on 23.03.2011 at 10:01 (+0200):
> http://www.volny.cz/iprenosil/interbase/ip_ib_quiz.htm#_quiz_3
>
> in web site first title explains the problems for your demand

That's true, it does.

> 2011/3/23 Eduardo A <eas@...>
> > Our databases were created with CharSet NONE.
> >
> > As we get ready to migrate and also update our application to
> > UniCode, what are the steps we must use to ensure that our hundreds
> > of databases in the field do not lose their stored character data?

I might be wrong, but I think there's a problem with the following
statement taken from Ivan Prenosil's page:
  It is not possible to directly convert string in NONE character set
  into string that is e.g. ISO8859_1 (unless it contains only ASCII
  characters); otherwise you get the famous "Cannot transliterate
  character between character sets" error.

NONE and OCTETS store only octets. Any octet is valid in ISO8859_1.
Hence there should never be any error converting from NONE or OCTETS
to ISO8859_1.
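
To illustrate why no octet can be invalid in ISO8859_1, here is a small sketch in Python. It uses Python's own "iso-8859-1" codec, not Firebird's transliteration code, so it only shows the property of the character set itself; the variable names are made up for this example.

```python
# Illustration only: Python's ISO-8859-1 codec, not Firebird's engine.
# Build a byte string containing every possible octet value, 0 through 255.
all_octets = bytes(range(256))

# Decoding maps each octet to exactly one character, so no error can occur.
decoded = all_octets.decode("iso-8859-1")

# Encoding back yields the identical octets: the mapping is lossless.
round_tripped = decoded.encode("iso-8859-1")

print(len(decoded), round_tripped == all_octets)
```

Whether Firebird applies the same rule when the declared character set is changed from NONE is exactly the question raised here.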
The same is not true for ASCII and Unicode stored as UTF-8.
In ASCII, characters above 127 aren't allowed. So octets above 127
should trigger an error. [1]
In Unicode, all characters are allowed. However, if the octet sequence
stored on disk isn't altered during a transcoding operation via the
system tables, the operation must fail if the octet sequence isn't
valid as UTF-8.
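
The difference between the three cases can be sketched the same way. Again, this is Python's codecs standing in for the character sets, under the assumption that the stored octets are left untouched and merely reinterpreted; `is_valid_as` is a hypothetical helper, not anything Firebird provides.

```python
def is_valid_as(octets: bytes, charset: str) -> bool:
    """Return True if the octets decode cleanly with the given Python codec."""
    try:
        octets.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# "Gül" as it would sit in a NONE column filled by a Latin-1 client:
# three octets, the middle one (0xFC) above 127.
stored = "Gül".encode("iso-8859-1")

for charset in ("iso-8859-1", "ascii", "utf-8"):
    print(charset, is_valid_as(stored, charset))
```

Only the ISO-8859-1 interpretation accepts these octets. ASCII rejects the octet above 127, and UTF-8 rejects it because 0xFC cannot start a valid sequence. The same text encoded as UTF-8 in the first place would of course be valid as UTF-8.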
[1] I managed, however, to insert non-ASCII characters in a column of
ASCII character set. Not sure it is, but it looks like a bug to me.
Will post a follow-up.
--
Michael Ludwig