Subject Re: [Firebird-Java] Re: UNICODE_FSS, internationalization issue
Author Blas Rodriguez Somoza
Hello

At 09/07/2003 15:10 +0000, Panagiotis Konstantinidis wrote:
>--- In Firebird-Java@yahoogroups.com, "Roman Rokytskyy"
><rrokytskyy@a...> wrote:
> > > No, I'm already using PreparedStatements, that is I use the
> > > setXXX() family of methods to set data for the query.
> >
> > Then it should not happen. Can you prepare a standalone test case
> > that reproduces this problem? You can extend the
> > org.firebirdsql.jdbc.TestFBEncodings test case (it already contains
> > tests for WIN1251, WIN1252, and UNICODE_FSS encodings).
>
>Well now I'm really getting desperate... As if all the previous was
>not enough I noticed that select queries with LIKE, containing
>characters not included in the 8859_1 charset hang the database
>connection utilizing all cpu.... I do not use any extraordinary
>query.. Just a "SELECT * FROM table_name WHERE column_name
>LIKE 'some unicode chars here'"
>
>As before the file.encoding in JVM is in 8859_7. If I change that to
>UTF-8 it works ok.
>
>How does file.encoding affect the driver? file.encoding should
>not affect the driver, since I have explicitly set the character set
>in the database and in the JDBC connection...

Inside a running Java program all strings are Unicode, but as soon as you
read from or write to a file (or any other byte stream) the platform
default character set is used, unless you specify another one.
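
A minimal sketch of that point, not taken from the driver: the class and
method names are made up for illustration, and an in-memory byte array
stands in for the file. Passing the charset explicitly to the
Writer/Reader removes the dependency on file.encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

public class ExplicitCharsetIo {

    // Encode text through a Writer that uses the named charset
    // instead of the platform default (file.encoding).
    static byte[] write(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            w.write(text);
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
        return out.toByteArray();
    }

    // Decode bytes through a Reader that uses the named charset.
    static String read(byte[] data, String charsetName) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(data), Charset.forName(charsetName))) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String greek = "\u03b1\u03b2\u03b3"; // alpha, beta, gamma
        System.out.println("default charset:  " + Charset.defaultCharset());
        System.out.println("UTF-8 bytes:      " + write(greek, "UTF-8").length);
        System.out.println("ISO-8859-7 bytes: " + write(greek, "ISO-8859-7").length);
        // Reading with a different charset than the one used for writing
        // silently produces different characters.
        System.out.println(read(write(greek, "ISO-8859-7"), "ISO-8859-1"));
    }
}
```

The same bytes give different strings depending on the charset used to
read them, which is why a changed file.encoding can change results when
some step relies on the default.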

For instance, a Java source file is read in the default encoding unless
you tell the compiler otherwise (javac -encoding); if you use characters
outside the default charset you need to write them as Unicode escapes
(\uXXXX).
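
As an illustration (the class name is made up), a source file that
contains only ASCII and spells the Greek letters as Unicode escapes
compiles to the same strings whatever the default encoding is:

```java
public class UnicodeEscapeDemo {

    // Pure-ASCII source: the escapes are translated by the compiler,
    // so the result does not depend on the encoding of the source file.
    static final String ALPHA_BETA = "\u03b1\u03b2"; // Greek alpha, beta

    // Render each UTF-16 code unit as U+XXXX for inspection.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(ALPHA_BETA));
    }
}
```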

This problem was typical in web applications before Servlet 2.3. The web
server should identify the character set of the input, but some browsers
don't declare it correctly, so you need to call the
ServletRequest.setCharacterEncoding() method, added in Servlet 2.3,
before reading any request parameters.

From your description, it seems that the Unicode data (which includes
both 8859_7 and 8859_1 characters) goes through some I/O operation that
converts the 8859_1 characters not included in 8859_7 (the default
encoding) to unknown characters ('?').
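
A small sketch of that effect, with made-up class and method names: when
a string is encoded with String.getBytes(Charset), every character the
target charset cannot represent is replaced by the charset's replacement
byte, which is '?' for the ISO-8859 family.

```java
import java.nio.charset.Charset;

public class LossyEncodingDemo {

    // Encode the text in the named charset and decode it again,
    // as an I/O step that uses that charset would do.
    static String throughCharset(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] bytes = text.getBytes(cs); // unmappable characters are replaced
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        String latin = "caf\u00e9";  // e with acute: in 8859_1, not in 8859_7
        String greek = "\u03b1";     // alpha: in 8859_7, not in 8859_1
        System.out.println(throughCharset(latin, "ISO-8859-7"));
        System.out.println(throughCharset(greek, "ISO-8859-1"));
        System.out.println(throughCharset(latin + greek, "UTF-8").equals(latin + greek));
    }
}
```

Once the '?' has been written, the original character cannot be
recovered downstream, whatever charset the database or the connection
uses.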

The driver converts from the Java internal string representation (UTF-16)
to the output encoding, which will be the Java equivalent of lc_ctype, or
the default character set if the FB encoding is NONE.
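
For reference, a sketch of setting the connection character set
explicitly through the driver's lc_ctype connection property, so that the
default character set does not come into play. The host, database path
and credentials are placeholders, the class and method names are made up,
and main() only builds the properties without opening a connection.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class FirebirdCharsetConnection {

    // Placeholder URL: adjust host, port and database path.
    static final String URL = "jdbc:firebirdsql:localhost/3050:/path/to/example.fdb";

    // Build connection properties with an explicit Firebird character set.
    static Properties connectionProperties(String user, String password, String fbCharset) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("lc_ctype", fbCharset); // e.g. UNICODE_FSS
        return props;
    }

    // Needs the Firebird JDBC driver on the classpath and a reachable server.
    static Connection open(Properties props) throws SQLException {
        return DriverManager.getConnection(URL, props);
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        Properties props = connectionProperties("SYSDBA", "masterkey", "UNICODE_FSS");
        System.out.println("lc_ctype = " + props.getProperty("lc_ctype"));
    }
}
```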

Anyway, as Roman says, the best way to solve this is to produce a test
case.

Regards
Blas Rodriguez Somoza