Subject Re: [Firebird-Java] Re: Character Sets & Accessing Firebird using Jaybird
Author Andreas Garzotto
> The following happens: JayBird receives data from the server and converts
> it into a string using the new String(byte[]) constructor. The JVM tries
> to convert the data into its Unicode representation, fails for some
> characters, and replaces them with '?'. Why? Because the JVM's default
> encoding does not have those characters. How to solve it? Either set the
> correct system-wide character set (I suspect your JVM runs on Linux and
> LANG or LC_CTYPE is "C"; set it to "ISO-Latin-1") or pass it with
> -Dfile.encoding=Cp1252, for example.
>
> To verify what I'm saying you can do the following: use the
> ResultSet.getBytes() method instead of getString() and save those byte
> arrays into a file. Then open it with a java.io.Reader and check
> the string you get.
[...]
> well, if there is a ResultSet.getBytes() then try to convert it with
> "new String(ResultSet.getBytes,"ISO8859-1")"
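
The check suggested in the quoted text can be sketched roughly as follows.
This is only an illustration under assumptions: the class and method names
are made up, the sample bytes stand in for what ResultSet.getBytes() would
return, and the java.nio.file API used here is newer than the 1.4-era JVMs
discussed in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: dumps raw column bytes to a file, then reads the
// file back through a java.io.Reader with an explicitly chosen charset.
public class ByteDumpCheck {

    static String roundTrip(byte[] raw, Charset charset) {
        try {
            Path dump = Files.createTempFile("column-dump", ".bin");
            try {
                // Save the bytes exactly as they came from the driver.
                Files.write(dump, raw);
                // Decode with the charset we believe the data is stored in,
                // not with the JVM default.
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(Files.newInputStream(dump), charset))) {
                    return reader.readLine();
                }
            } finally {
                Files.deleteIfExists(dump);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Zürich" encoded as ISO-8859-1; stands in for rs.getBytes(...).
        byte[] raw = {0x5A, (byte) 0xFC, 0x72, 0x69, 0x63, 0x68};
        String decoded = roundTrip(raw, StandardCharsets.ISO_8859_1);
        System.out.println(decoded.equals("Z\u00FCrich"));
    }
}
```

If the string read back is correct with an explicit charset but wrong with
getString(), the bytes are fine and the problem is in the decoding step.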

Thanks to everybody. I can confirm that Jaybird does indeed pass the data
from the database correctly. ResultSet.getBytes() returns the 8-bit data
intact. If it is converted explicitly using
new String(ResultSet.getBytes(), "ISO8859-1"), the String is correct.
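
For reference, the explicit conversion described above could look like the
sketch below. The class name, method names and column label parameter are
hypothetical; only ResultSet.getBytes(String) and the String(byte[], Charset)
constructor are standard API, and StandardCharsets requires a newer JVM than
the 1.4.2 mentioned in this thread (there, the charset name "ISO8859-1" would
be passed as a string instead).

```java
import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper that bypasses the JVM default encoding by decoding
// the raw column bytes as ISO-8859-1 explicitly.
public class Latin1Column {

    // Decodes raw bytes as ISO-8859-1; a SQL NULL (null array) stays null.
    static String decodeLatin1(byte[] raw) {
        return raw == null ? null : new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Reads a column as bytes and decodes it, instead of calling getString().
    static String getLatin1String(ResultSet rs, String columnLabel) throws SQLException {
        return decodeLatin1(rs.getBytes(columnLabel));
    }

    public static void main(String[] args) {
        // "Zürich" encoded as ISO-8859-1; stands in for data from the server.
        byte[] raw = {0x5A, (byte) 0xFC, 0x72, 0x69, 0x63, 0x68};
        System.out.println(decodeLatin1(raw).equals("Z\u00FCrich"));
    }
}
```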

However, I still cannot make it work using ResultSet.getString(). An
obvious workaround would be to use getBytes() instead of getString(), but
we access the data through a library (Jakarta Torque), so this is not
really an option.

The way to go seems to be to tell the JVM which default character set to
use. It appears to me that something like -Dfile.encoding=ISO-8859-1 is
ignored under Blackdown Java 1.4.2. export LC_CTYPE=ISO-Latin-1 does not
seem to have any effect either (under SuSE Linux 8.1). This is no longer a
topic for this list, I guess, but if anyone can give me a hint...
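
One way to see whether such a setting took effect is to ask the JVM which
encoding it actually uses. The small diagnostic below is a sketch, not a
fix; the class and method names are made up. Note that
Charset.defaultCharset() only exists from Java 5 on, so on a 1.4.2 JVM only
the system property and the OutputStreamWriter check would be available. It
may also be worth checking with "locale -a" that the value given to
LC_CTYPE is a locale name the system actually knows.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

// Hypothetical diagnostic: prints what the running JVM considers its
// default encoding, from three different angles.
public class DefaultEncodingCheck {

    // The encoding a Writer picks when none is given explicitly; this is
    // the same default that new String(byte[]) relies on.
    static String writerEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    public static void main(String[] args) {
        // Value of the system property (may or may not be honored).
        System.out.println("file.encoding        = " + System.getProperty("file.encoding"));
        // Default charset as reported by the JVM (Java 5 and later only).
        System.out.println("defaultCharset()     = " + Charset.defaultCharset());
        // Encoding actually chosen for a Writer created without a charset.
        System.out.println("OutputStreamWriter   = " + writerEncoding());
    }
}
```

Running it once with and once without -Dfile.encoding=ISO-8859-1 shows
whether the option changes anything on a given JVM.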


Andreas