Subject Re: Performance bug located and solved
Author Roman Rokytskyy
Hi,

> First at all, Interclient uses also a default charset not NONE
> but ISO8859_1, this the reason why I do that.

Maybe version 1.6 did. If you get the InterClient 2.0 sources from
CVS, you will see that "NONE" is the default character set. However,
they do set it explicitly.

> > We had this issue before. There were a code when lc_ctype was
> > determined from "file.encoding" property and there were a lot of
> > problems with it. I strongly recommend "NONE".
> >
>
> Of course if we find an error I can't solve, I will rollback
> the change, but let me try to solve the problems if they come.
>
> Perhaps if you can tell me which problems, I can analyze it. I
> think the performance difference justify some work.

No, I was not talking about rolling back your work. You did a great
job, and a 10-20% increase in performance is very important.
However, "NONE" is the default character set encoding in the
documentation, so let's use it.

The problem with "NONE" is that if you have a database whose default
character set is "NONE", or a column defined as "NONE", and you
connect with a client whose encoding is not "NONE", read access to
the "NONE" fields will fail. The reason is that the server is not
able to translate the "NONE" character set into a non-"NONE"
character set (for example "ISO8859_1"). Also, if your client
encoding is "NONE" and you try to write into a non-"NONE" column,
the request will fail for the same reason.
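
To make the two failure cases concrete, here is a minimal Java sketch
of the rule as I described it above. It is only a model, not server or
driver code: the class CharsetRule and its methods are hypothetical
names, and every combination not mentioned above is simply assumed to
succeed.

```java
// Hypothetical model of the two failure cases described in this message.
// It does not talk to a server; it only encodes the stated rule.
class CharsetRule {

    // True when the given character set name is "NONE".
    static boolean isNone(String charset) {
        return "NONE".equalsIgnoreCase(charset.trim());
    }

    // Reading fails when the column is "NONE" and the client is not.
    // All other combinations are ASSUMED to work in this sketch.
    static boolean canRead(String columnCharset, String clientCharset) {
        return !(isNone(columnCharset) && !isNone(clientCharset));
    }

    // Writing fails when the client is "NONE" and the column is not.
    // All other combinations are ASSUMED to work in this sketch.
    static boolean canWrite(String columnCharset, String clientCharset) {
        return !(isNone(clientCharset) && !isNone(columnCharset));
    }

    public static void main(String[] args) {
        String[] charsets = {"NONE", "ISO8859_1"};
        for (String column : charsets) {
            for (String client : charsets) {
                System.out.println("column=" + column
                        + " client=" + client
                        + " read=" + canRead(column, client)
                        + " write=" + canWrite(column, client));
            }
        }
    }
}
```

The only combination that both reads and writes in every case is the
one where client and column use the same character set.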

Now, if people have databases with "NONE" as the default character
set (a quite common case) because they store only ASCII characters,
setting the client encoding to something different from "NONE" will
prevent them from reading the information back from the database.
I'm not sure they will like this.

If you want to play with encodings yourself, use TestFBEncodings.java
and try to change client encoding to "NONE" or try to
read "none_field" when your client encoding is not "NONE".
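
If you prefer to experiment outside the test case, a sketch like the
following shows where the client encoding is chosen. The "lc_ctype"
connection property is the one discussed in this thread; the class
name, the credentials and the way the URL is passed in are
placeholders of mine, and the JayBird driver must be on the classpath
for the connection attempt to work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper for trying different client encodings.
class ClientEncodingDemo {

    // Builds JDBC connection properties with the given client encoding.
    static Properties connectionProperties(String user, String password,
                                           String clientEncoding) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Client character set requested from the server.
        props.setProperty("lc_ctype", clientEncoding);
        return props;
    }

    // Opens a connection; url is whatever JDBC URL your setup uses.
    static Connection open(String url, Properties props) throws SQLException {
        return DriverManager.getConnection(url, props);
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder credentials and encoding, change as needed.
        Properties props = connectionProperties("sysdba", "masterkey", "NONE");
        System.out.println("lc_ctype=" + props.getProperty("lc_ctype"));

        // Connect only when a JDBC URL is given on the command line.
        if (args.length > 0) {
            try (Connection connection = open(args[0], props)) {
                System.out.println("connected: " + !connection.isClosed());
            }
        }
    }
}
```

Switching the third argument of connectionProperties between "NONE"
and, for example, "ISO8859_1" is enough to compare the two cases.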

> >
> > Sorry, JayBird is a client software, this means that server
> > determines what and how driver sends to server and not vice versa.
> >
>
> In the remote protocol of firebird, if I'm not wrong, is the client
> who decide which character set to use, the same that happens
> with the dialect, or the protocol. The server only validates is the
> request was wrong or not. The characterSet the client send to
> the server are used for
> - Communicate to the server which characterSet it receives, if the
> server don't know that, it seems it verify the charset.
> - Request from the server which characterset wants in the data
> sended from the server to the client.

I totally agree with you here. I was objecting to your statement
that, in order to get the best performance, people have to convert
their databases into the encoding corresponding to the default Java
encoding. Now I understand what you mean. Yes, I agree, using the
same encoding in the database and the client will save some CPU
cycles.

> According to the code (jrd.cpp) this option is only available since
> Interbase 4.x.

And that was ~1997-98, 5-6 years ago. Quite a long period of time :)

Best regards,
Roman Rokytskyy