Subject | Re: [firebird-support] understanding characters sets |
---|---|
Author | Kjell Rilbe |
Post date | 2008-08-15T07:23:27Z |
Helen Borrie wrote:
> Strings are written to the database using the defined character set,
> which will be either the default character set defined for the database
> or, if present, the character set defined for the column they are
> written to. For string data to be transliterated correctly for both
> writing and reading, the connection character set must be the same as
> the destination character set.
I do not fully understand this.
It sounds as if, when I have a column with e.g. UTF8, I must connect with
UTF8 for transliteration to work correctly.
Huh?
What if I have two columns, one with UTF8 and one with ISO 8859-1? Then
I'd have to use two different connections to get both transliterated
correctly - one connection for each character set.
This can't be how it's supposed to work, can it?
So, I must be missing something?
Before reading your post, I thought that if I have a connection with
e.g. UTF8, then all strings passed through that connection are assumed
to be UTF8, and if that data is going into a column with a different
character set, it will be transliterated to that character set.
(Excepting NONE and OCTETS).
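To make that assumed model concrete, here is a minimal Python sketch. It is
not Firebird code and does not claim to show what the server actually does;
the function name transliterate and the sample string are made up for
illustration, and the character set names are Python codec names rather
than Firebird's.

```python
def transliterate(raw: bytes, connection_charset: str, column_charset: str) -> bytes:
    """Re-encode bytes from the connection charset into the column charset.

    Hypothetical helper: it only illustrates the model described above.
    """
    # NONE and OCTETS are the exceptions: bytes pass through untouched.
    passthrough = {"NONE", "OCTETS"}
    if connection_charset.upper() in passthrough or column_charset.upper() in passthrough:
        return raw
    # Interpret the incoming bytes in the connection charset...
    text = raw.decode(connection_charset)
    # ...and store them in the column's own charset.
    return text.encode(column_charset)


# The client always sends UTF-8 because the connection is UTF-8.
client_bytes = "Smörgåsbord".encode("utf-8")

# Same connection, two columns with different character sets.
stored_utf8 = transliterate(client_bytes, "utf-8", "utf-8")
stored_latin1 = transliterate(client_bytes, "utf-8", "iso-8859-1")
stored_octets = transliterate(client_bytes, "utf-8", "OCTETS")
```

Under this model a single UTF8 connection could serve both columns, since
the conversion is decided per column rather than per connection.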
On the other hand, this also seems problematic. In markd_mms' situation,
text pasted from web pages would probably often be in ISO 8859-1, but
might also be UTF8. So, he'd need to be able to send both character sets
to the DB in some way. In general, an application might need to use
different character sets for input to different columns. If all
strings have to be sent in the connection's character set, that would be
problematic.
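One way an application could cope, if everything really must be sent in the
connection's character set, is to normalise the input on the client before
sending it. The Python sketch below is only an assumption about how that
might be done: to_connection_charset is a hypothetical helper, and the
"valid UTF-8, else ISO 8859-1" guess is a heuristic, not a reliable way to
detect an encoding.

```python
def to_connection_charset(raw: bytes, connection_charset: str = "utf-8") -> bytes:
    """Convert pasted bytes of unknown origin to the connection charset.

    Heuristic (an assumption, not a guarantee): bytes that decode cleanly as
    UTF-8 are treated as UTF-8; anything else is treated as ISO 8859-1.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO 8859-1 maps every byte value, so this decode cannot fail.
        text = raw.decode("iso-8859-1")
    return text.encode(connection_charset)


# The same word as it might arrive from two differently encoded web pages.
pasted_latin1 = "café".encode("iso-8859-1")
pasted_utf8 = "café".encode("utf-8")

# Both end up as identical bytes in the connection charset.
sent_from_latin1 = to_connection_charset(pasted_latin1)
sent_from_utf8 = to_connection_charset(pasted_utf8)
```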
So, how is this actually supposed to work, and how does it currently
work? (Which might be two different things...)
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-mail: kjell@...
Phone: 08-761 06 55
Mobile: 0733-44 24 64