Subject | Re: [firebird-support] Bug with character sets |
---|---|
Author | Dimitry Sibiryakov |
Post date | 2009-05-20T07:31:01Z |

>>> 1. What does Milan's code know that fbclient doesn't?
>> Length of string in characters.
>
> I understand that Milan derives that from 1) buffer size and 2) max
> number of bytes per character for the encoding used. The latter is
> deduced from a charset id. Where does he get that charset id?

From SQLVAR structure, but your assumption is wrong. One can't get
right string length from buffer size and number of bytes per character.
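
A minimal sketch of why a byte count alone does not give the character count in a variable-width encoding such as UTF-8. Python is used purely for illustration; `utf8_char_count` is a hypothetical helper, not part of fbclient or any Firebird API.

```python
def utf8_char_count(raw: bytes) -> int:
    """Count code points by skipping UTF-8 continuation bytes (10xxxxxx)."""
    return sum(1 for b in raw if (b & 0xC0) != 0x80)

# Three values that all occupy exactly 4 bytes in UTF-8 ...
a = "abcd".encode("utf-8")            # four 1-byte characters
b = "\u00e9\u00e9".encode("utf-8")    # two 2-byte characters
c = "\U0001F600".encode("utf-8")      # one 4-byte character

# ... but contain different numbers of characters.
for raw in (a, b, c):
    print(len(raw), "bytes ->", utf8_char_count(raw), "characters")
```

The count only comes out right because the code knows how UTF-8 marks continuation bytes, which is encoding-specific knowledge.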

>>> 2. Where does Milan's code get that info from?
>> From conversion from UTF8 to UTF16 (I have a feeling that Milan's
>> procedure can have incorrect results if encounter character with 4-bytes
>> code, though I most likely am wrong).
>
> Not really - that's how he does the trimming, not how he deduces the
> correct length in codepoints.

These tasks are absolutely the same.
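
A small sketch of the two points raised here, under stated assumptions. First, trimming a padded UTF-8 buffer to N characters needs the same decoding step as counting its characters. Second, a character with a 4-byte UTF-8 code takes two UTF-16 code units, so counting UTF-16 units can overcount. Whether Milan's actual procedure is affected by this is not known from the thread; `utf16_units` and `trim_to_codepoints` are hypothetical names, written in Python for illustration only.

```python
def utf16_units(text: str) -> int:
    """Number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

def trim_to_codepoints(raw: bytes, max_chars: int) -> str:
    """Decode a UTF-8 buffer and keep at most max_chars code points."""
    # Decoding is what yields the character boundaries; counting and
    # trimming both depend on it.
    return raw.decode("utf-8")[:max_chars]

text = "a\U0001F600b"                       # 3 code points, one outside the BMP
padded = text.encode("utf-8") + b" " * 5    # assumed space-padded field buffer

print(len(text), utf16_units(text))         # code points vs UTF-16 code units
print(repr(trim_to_codepoints(padded, 3)))
```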

>>> 3. Why doesn't fbclient have that info?
>> Because fbclient doesn't know what UTF8 is and how convert it into
>> UCS. FB client is rather simple tosser of data from transport packets
>> into client structures and back.
>
> Does fbclient know the charset id?

Yes, but it has no idea what this id means.

>>> 4. Can FB be changed so fbclient can get that info in the future, and
>>> use it to trim the buffer to the right size before passing it to the
>>> client application?
>> Maybe, but so far nobody knows a good way to accomplish that.
>
> Can fbclient be provided with the charset id?

It already is provided, but there is no use for it.

> I think you're wrong here. Any author of a library or framework should
> really strive to make the interfaces intuitive and easy to understand.
> Anything else will cause a lot of frustration and a lot of unnecessary
> application bugs.

So do authors of API envelopes such as IBProvider or FIB+. Few
applications today use the API directly.

> But even if the answer to 4 is that there *is* an easy way to provide
> fbclient with the necessary info to trim the result (i.e. the charset
> id), I'm not convinced fbclient should be bloated with the required
> capability to parse and trim all charsets available in Firebird.

Here you are exactly right. That's why fbclient has no idea about
character sets - it would require adding the whole INTL module to it,
including ICU.

SY, SD.