Subject | Re: lc_ctype support |
---|---|
Author | rrokytskyy |
Post date | 2002-04-09T23:01:49Z |
> I also agree with Roman :))) Although I wanted to show that there
> are problems that simply can not be solved with current solution. :)

Ok, but, as usual, I try to find a way to solve that very
problem with the current solution. lc_ctype=UNICODE_FSS works, so I need
more real-world examples. :)
> r> If we agree, the client-side encoding is always UNICODE_FSS
> r> (which is not too bad I think), then we really do not need all
> r> this conversion.
> r> Firebird will do (or is supposed to do) this automatically. But
> r> we then have to require people to set at least the default
> r> database encoding. We just pass the result of str.getBytes
> r> ("UTF8") (not str.getBytes() because this will not be a unicode
> r> stream) to engine.
> r> And this is the responsibility of the engine to store data
> r> according to database/column definition.
>
> I'll be very happy if that works!!!!

Well, check my latest unit test version. :)
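Just to make that point concrete, a tiny sketch (not driver code, the literal is only an example):

    import java.io.UnsupportedEncodingException;
    import java.util.Arrays;

    public class UnicodeStreamSketch {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String str = "\u20ac"; // EURO SIGN

            // JVM default encoding - NOT a unicode stream,
            // e.g. the single Cp1252 byte 0x80 on a western Windows
            byte[] platformBytes = str.getBytes();

            // the UTF-8 stream the engine expects for UNICODE_FSS: 0xE2 0x82 0xAC
            byte[] unicodeBytes = str.getBytes("UTF8");

            System.out.println(Arrays.toString(platformBytes)); // [-128] on a Cp1252 JVM
            System.out.println(Arrays.toString(unicodeBytes));  // [-30, -126, -84]
        }
    }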
> r> Therefore we need encodings.
>
> I've never said we don't need them. They are really important, your
> example is a really good one. In many cases I want to switch them
> off, because *many* Java subsystems do not care about the default
> encoding, but they use ISO-8859-1.

Default encoding is not always ISO-8859-1. In my case it is Cp1252 if
I have Germany in regional settings and Cp1251 in case of Ukraine.
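You can see it for yourself with a one-liner; the value of course depends on the machine:

    public class DefaultEncodingSketch {
        public static void main(String[] args) {
            // prints e.g. "Cp1252" with German regional settings, "Cp1251" with Ukrainian ones
            System.out.println(System.getProperty("file.encoding"));
        }
    }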
> r> The main idea in the discussion is whether to provide or not an
> r> option not to convert using this way:
> r> national_unicode->correct_unicode->connection_encoding->byte[],
> r> but national_unicode->byte[] directly if we assume that
> r> national_unicode and connection_encoding are the same.
>
> Actually now the driver does the following:
> when you read from the database:
> byte[] -> correct_unicode

Right. Note that byte[] contains data in the encoding specified by
lc_ctype.

> when you write to the database:
> correct_unicode -> national_unicode -> byte[]

Wrong. Only correct_unicode -> byte[], and byte[] contains data in the
encoding specified by lc_ctype. "national_unicode" involves the default
JVM encoding; this was used before, but not now that lc_ctype is
present.
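Roughly like this, if we assume lc_ctype=WIN1251 and that it maps to Java's Cp1251 (a sketch, not the actual driver code):

    import java.io.UnsupportedEncodingException;

    public class LcCtypeConversionSketch {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String javaEncoding = "Cp1251"; // assumed mapping for lc_ctype=WIN1251

            // write path: correct_unicode -> byte[] in the lc_ctype encoding
            String value = "\u0445\u0430"; // Cyrillic "kha", "a"
            byte[] toEngine = value.getBytes(javaEncoding);

            // read path: byte[] in the lc_ctype encoding -> correct_unicode
            String fromEngine = new String(toEngine, javaEncoding);

            // true - no JVM default encoding involved in either direction
            System.out.println(value.equals(fromEngine));
        }
    }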
> but it requires you to pass correct_unicode. correct to national
> conversion is done by using lc_ctype, but probably that is not good
> when you need to write a column with different character set from
> lc_ctype.

It is definitely a bad idea to have lc_ctype=WIN1252 and write to a
WIN1251 column. You will get an exception from Firebird. But if
you specify lc_ctype=UNICODE_FSS, you are able to write to
WIN1251 and WIN1252 columns simultaneously.
> I can ask again what to do with columns with different character
> sets.

Use lc_ctype=UNICODE_FSS. :)
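For example (URL, path and credentials are placeholders, and I only assume the property is still called lc_ctype):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class UnicodeFssConnectionSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("org.firebirdsql.jdbc.FBDriver");

            Properties props = new Properties();
            props.setProperty("user", "sysdba");
            props.setProperty("password", "masterkey");
            props.setProperty("lc_ctype", "UNICODE_FSS"); // client-side encoding

            Connection connection = DriverManager.getConnection(
                "jdbc:firebirdsql:localhost/3050:/path/to/database.gdb", props);

            // ... write to WIN1251 and WIN1252 columns over the same connection ...

            connection.close();
        }
    }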
> Everything can be implemented ;) Moreover only one boolean field
> should be added to FBManagedConnection, and when it is set NONE should
> be returned by getIscEncoding:
>
> public String getIscEncoding() {
>     *****
>     if (pleaseDontConvertMyAlreadyConvertedString)
>         return "NONE";
>     *****
>
>     try {
>         String result = cri.getStringProperty(GDS.isc_dpb_lc_ctype);
>         if (result == null) result = "NONE";
>         return result;
>     } catch(NullPointerException ex) {
>         return "NONE";
>     }
> }

I had problems with such a solution. However, instead of adding this to
FBManagedConnection, I just commented out code in FBField. And the 0xF5
char is converted to '?' by the _JVM_ in the call str.getBytes(). I repeat
again, my JVM default encoding is Cp1252. Do you want such behaviour?
I doubt it.
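A sketch of what happens (the byte 0xF5 is Cyrillic 'kha' in Cp1251, and only the explicit encoding preserves it):

    import java.io.UnsupportedEncodingException;
    import java.util.Arrays;

    public class ReplacementCharSketch {
        public static void main(String[] args) throws UnsupportedEncodingException {
            // the character that is byte 0xF5 in Cp1251
            String str = new String(new byte[] { (byte) 0xF5 }, "Cp1251");

            // JVM default encoding Cp1252: prints [63], i.e. the '?' replacement
            System.out.println(Arrays.toString(str.getBytes()));

            // encoding taken from lc_ctype: prints [-11], i.e. 0xF5 is preserved
            System.out.println(Arrays.toString(str.getBytes("Cp1251")));
        }
    }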
> That's all. The option can be passed with the Properties in the
> connect() in FBDriver.

This might work in your case, but it definitely does not in mine. And
the problem here is the uncertainty introduced by the JVM default
encoding, which cannot be controlled.
Best regards,
Roman Rokytskyy