Subject | Re: [Firebird-Java] Approaches at JB-to-FB connections regarding charsets |
---|---|
Author | Mark Rotteveel |
Post date | 2012-06-29T16:01:12Z |
On 29-6-2012 17:07, Roman Rokytskyy wrote:
> There will be no data corruption. Server will fail before - it won't
> know how to convert NONE to UTF8, so no data will be delivered to the
> client.
If your database is NONE it really doesn't matter what your connection
character set is. It will always work (sort of). Connecting with NONE to
a UTF8 database will result in transliteration errors when you use
characters outside of ASCII. But connecting with UTF8 to a NONE database
will just work. You might get malformed text back though.
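
To make "malformed text" a bit more concrete: with a NONE database the stored bytes are handed back untouched, so whether they read correctly depends on the charset the client decodes them with. The following is only a sketch in plain Java, with no Firebird or Jaybird involved; the class and method names are hypothetical. It checks whether a byte sequence is valid UTF-8 using a strict decoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Jaybird): reports whether raw bytes,
// such as the content of a NONE column, would decode cleanly as UTF-8.
final class Utf8Check {
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit a byte sequence that is not valid UTF-8
            return false;
        }
    }

    public static void main(String[] args) {
        Charset win1252 = Charset.forName("windows-1252");
        String text = "H\u00f3\u00f3g"; // the test word with two o-acute characters

        // Bytes as a win1252 client would send them: 48 F3 F3 67
        System.out.println(isValidUtf8(text.getBytes(win1252)));                 // false
        // Bytes as a UTF8 connection would send them: 48 C3 B3 C3 B3 67
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));  // true
        // Pure ASCII is identical in both encodings
        System.out.println(isValidUtf8("ascii".getBytes(win1252)));              // true
    }
}
```

Pure ASCII content passes either way, which is why such a mismatch can go unnoticed for a long time.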
I connected using NONE to a NONE database and added two rows with
ascii and Hóóg, and then connected with UTF8 and added two more rows
with the same content.
If I query this back I get:
Connected with UTF8
ascii
H?
ascii
Hóóg
Connected with no explicit charset
ascii
Hóóg
ascii
Hóóg
Here I think the ? is a byte combination that is invalid in UTF-8.
(My local system encoding is win1252, btw.)
No error, but I do have logical corruption one way or the other.
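
Both kinds of damage in the NONE results above can be imitated without a database by encoding a string with one charset and decoding the same bytes with another. This is a sketch under the assumption that the bytes pass through unchanged, as they do with NONE; the class and method names are made up for illustration. The exact replacement text Jaybird returned (H?) is not reproduced here, only the fact that the bytes do not survive.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates writing with one connection charset
// and reading the unmodified bytes back with another.
final class MojibakeDemo {
    // Encodes text with the writer's charset, then decodes with the reader's.
    static String reinterpret(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        Charset win1252 = Charset.forName("windows-1252");
        String text = "H\u00f3\u00f3g";

        // Written over a UTF8 connection, read back as win1252: each
        // o-acute (C3 B3) turns into two characters, U+00C3 and U+00B3.
        String mojibake = reinterpret(text, StandardCharsets.UTF_8, win1252);
        System.out.println(mojibake.equals("H\u00c3\u00b3\u00c3\u00b3g")); // true

        // Written as win1252, read back as UTF-8: F3 F3 is not a valid
        // sequence, so Java substitutes U+FFFD replacement characters.
        String damaged = reinterpret(text, win1252, StandardCharsets.UTF_8);
        System.out.println(damaged.contains("\uFFFD")); // true
    }
}
```

The first case is recoverable by decoding the bytes again with the right charset; in the second case the substituted characters are all the client gets to see.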
Now if I do the same with a win1252 database I get:
Connected with UTF8
ascii
Hóóg
ascii
Hóóg
Connected with no explicit charset
ascii
Hóóg
ascii
Hóóg
On the other hand, if I accidentally use a win1251 database, I fail to
insert Hóóg when connected with UTF8, but on retrieval I get:
Connected with UTF8
ascii
H??g
ascii
Connected with no explicit charset
ascii
Hóóg
ascii
Note: the ? is a result of the console being win1252; if I had used a
component capable of displaying Unicode, it would actually show \u0443
(which is 0xF3 in Windows-1251).
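
The win1251 case can be traced the same way. The sketch below (plain Java, hypothetical names, no database involved) takes the win1252 bytes of the test word, reads them as Windows-1251, and then imitates a console that can only encode win1252.

```java
import java.nio.charset.Charset;

// Hypothetical demo class illustrating the win1251 mix-up.
final class Win1251Demo {
    static final Charset WIN1251 = Charset.forName("windows-1251");
    static final Charset WIN1252 = Charset.forName("windows-1252");

    // Imitates output on a console limited to the given charset:
    // characters it cannot encode come out as '?'.
    static String asShownOn(Charset console, String text) {
        return new String(text.getBytes(console), console);
    }

    public static void main(String[] args) {
        // Bytes stored by the win1252 client: 48 F3 F3 67
        byte[] stored = "H\u00f3\u00f3g".getBytes(WIN1252);

        // The same bytes interpreted as Windows-1251 (Cyrillic)
        String read = new String(stored, WIN1251);
        System.out.println(Integer.toHexString(read.charAt(1))); // 443

        // A win1252 console has no Cyrillic letters, hence the question marks
        System.out.println(asShownOn(WIN1252, read)); // H??g
    }
}
```

So the text is not lost in this direction: byte 0xF3 is simply a different letter in each of the two code pages.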
Mark
--
Mark Rotteveel