Subject | Re: [firebird-support] understanding characters sets |
---|---|
Author | Kjell Rilbe |
Post date | 2008-08-15T09:12:59Z |
Helen Borrie wrote:
> No. It's a problem when the input coming from the client is not
> well-formed for *either* character set (the connection charset or the
> destination column's charset). For a simple example, say the OP did a
> copy from website text that was ASCII with some 8-bit character images
> rendered from &xxx; elements in the HTML. The character stream that he
> pastes into waiting parameters in the client cannot possibly be
> well-formed and malformed strings can't be transliterated. Data being
> collected in this fashion off a web page will be particularly prone to
> this. You can get a reasonable indication of whether the data you are
> copying in this fashion is well-formed or is more like alphabet soup by
> inspecting what the rendered characters look like in the page source....
OK. Now I understand that the client application has to transliterate
to the connection's character set, regardless of what character set is
used in each DB column. Is this correct?
So, if a DB contains one UTF8 column and one ISO8859_1 column, and the
client connects with UTF8, then the application has to send the
ISO8859_1 data in UTF8 format, even if the data entered into the
client application is already ISO8859_1.
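For what it's worth, here is roughly how I picture the client side, as
a Python sketch (the fdb driver, its connect() arguments and the
table/column names are my own assumptions, purely for illustration):

import fdb

con = fdb.connect(dsn='localhost:/data/test.fdb',
                  user='SYSDBA', password='masterkey',
                  charset='UTF8')   # connection character set
cur = con.cursor()

# Data as it arrived in the client application: ISO8859_1 bytes.
iso_bytes = b'\xc5\xc4\xd6\xe5\xe4\xf6'   # "ÅÄÖåäö"

# Decode with the *source* character set. If the bytes were really a
# mix of encodings (Helen's pasted-from-a-web-page example), this step
# produces garbage or, for stricter charsets than ISO8859_1, raises an
# error - i.e. the data was never well-formed to begin with.
text = iso_bytes.decode('iso8859_1')

# The driver sends the parameter in the connection charset (UTF8);
# the server transliterates it back to ISO8859_1 for the column.
cur.execute("insert into t (iso_col) values (?)", (text,))
con.commit()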
Would it be possible to select the connection character set on a
per-column or per-parameter basis? It seems to me that it would be
useful to be able to do so.
On the other hand, you could always have such an application connect
with NONE and make sure it's aware of each column's character set. That
way it would be able to send data in different character sets for
different columns. Right?
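Something like this is what I have in mind - again only a sketch, and
I'm assuming that with NONE the driver passes byte string parameters
through unmodified:

import fdb

con = fdb.connect(dsn='localhost:/data/test.fdb',
                  user='SYSDBA', password='masterkey',
                  charset='NONE')   # no transliteration by the server
cur = con.cursor()

name = 'ÅÄÖåäö'

# With NONE the client has to know each column's character set and
# encode each value accordingly.
cur.execute(
    "insert into t (utf8_col, iso_col) values (?, ?)",
    (name.encode('utf_8'),        # column declared as UTF8
     name.encode('iso8859_1')))   # column declared as ISO8859_1
con.commit()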
And what about system tables? I previously had some problems with
column names containing non-ASCII characters (Swedish ÅÄÖåäö). I think
the answer I got was that all system tables use UTF8 for their string
data. In that case, I suppose all identifiers in DDL and DML statements
should be assumed to be in the connection character set, and FB would
transliterate them to UTF8. Is this actually the case?
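Put differently, I would expect something along these lines to just
work, though I haven't verified it (same assumed driver and made-up
names as above):

import fdb

con = fdb.connect(dsn='localhost:/data/test.fdb',
                  user='SYSDBA', password='masterkey',
                  charset='UTF8')
cur = con.cursor()

# The DDL text, including the quoted Swedish identifiers, travels in
# the connection character set; whether the server then transliterates
# the names into the system tables' own character set is exactly what
# I'm asking about.
cur.execute('create table "Återförsäljare" '
            '("Löpnummer" varchar(50) character set ISO8859_1)')
con.commit()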
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kjell@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64