Subject | Re: UTF8, malformed string error |
---|---|
Author | Roman Rokytskyy |
Post date | 2007-02-23T13:22:05Z |
> Well, explicitly calling a "long string to UTF8 string" works, the
> strings get inserted.
>
> Reading them fails and gets you wrong charactersets, IBO doesn't
> seem to support all that UTF8 stuff.

You have to call the "UTF8 string to long string" conversion routine then :)

> What does Firebird expect here? Should clients transform the server
> side chars to something at the client? I guess, when they request it
> from the server with UTF8, it returns them as such. I just connected
> with ISO8859_1 and now the strings are returned correctly, so that
> would be a server side translation, correct?

Firebird will translate characters from the charset of the particular
field (or the default db charset) into the charset that was specified in
the lc_ctype property in the DPB. If nothing was specified, lc_ctype=NONE
and data are stored and read "as is" (note, very likely it won't work for
the new UTF8 charset, though UNICODE_FSS will "swallow" it).
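
To make the translation rule concrete, here is a small Python sketch that only simulates the behaviour described above. It is not Firebird code: the `to_client` function and the mapping from Firebird charset names to Python codec names are assumptions made for illustration.

```python
# Assumed mapping from Firebird charset names to Python codec names.
CODECS = {"ISO8859_1": "latin-1", "UTF8": "utf-8"}


def to_client(stored: bytes, field_charset: str, lc_ctype: str = "NONE") -> bytes:
    """Simulate what a client receives for bytes stored in a field.

    With lc_ctype=NONE the bytes are handed over "as is". Otherwise they
    are decoded from the field charset and re-encoded in the connection
    charset.
    """
    if lc_ctype == "NONE":
        return stored
    text = stored.decode(CODECS[field_charset])
    return text.encode(CODECS[lc_ctype])


# A value stored in a UTF8 field.
stored = "Grüße".encode("utf-8")

as_latin1 = to_client(stored, "UTF8", "ISO8859_1")  # translated for the client
as_is = to_client(stored, "UTF8")                   # lc_ctype=NONE, untouched
```

A client connected with ISO8859_1 gets one byte per character here, while a client connected with NONE gets the raw UTF-8 bytes and has to know by itself how to interpret them.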

> Either way -- there's another problem.
>
> How do I know when to call these, given the generic nature of this
> application?

Specify the lc_ctype when connecting to the database. If all your
applications do this and the database has correctly specified charsets in
fields, you will always obtain data according to your lc_ctype.

> All I know is "string" or "widestring" in Delphi, nothing about
> encodings.

If I'm not mistaken, the wide string is UCS2 (Unicode, 2 bytes per
character). Theoretically IBO could have an AsWideString property and
perform the conversion according to what you have specified in the DPB.
Also, I did not check the IBO sources, but I am pretty sure that the
AsString property simply converts the specified string bytewise into
a byte array. And now there's a problem - somewhere there (I guess in
IBO) the 0x00 character is considered to be the string terminator. Now,
when you assign the wide string, it always has 2 bytes per character
and on a little-endian platform that would be something like 0x65 0x00
0x66 0x00 0x67 0x00 and so on.
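
The byte layout is easy to reproduce. The Python sketch below encodes a short string as little-endian 2-byte characters and then reads it the way C-string code would; the helper `as_c_string` is hypothetical and only mimics "stop at the first 0x00", it is not taken from IBO.

```python
def as_c_string(buffer: bytes) -> bytes:
    """Return the bytes a C-string reader would see: everything before the first 0x00."""
    return buffer.split(b"\x00", 1)[0]


# "efg" as 2 bytes per character, little-endian.
wide = "efg".encode("utf-16-le")

print(wide.hex(" "))      # 65 00 66 00 67 00
print(as_c_string(wide))  # b'e'
```

Only the first character survives, which would match the symptom if some component on the way treats the data as a C string.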
Now, when you give that string to IBO, somewhere on the way some
component treats 0x00 as a C-string terminator and stops processing
data. I am almost 100% sure that it is neither fbclient nor the
database engine, since the XSQLVAR has two properties: sqllen and
sqldata. And the sqllen contains the length of the passed string in
bytes (and it must correspond to what we try to send to the server). I
do not remember exactly, but one of the test cases in our JDBC driver
inserts 0x00 in the middle of the byte array into the database and
then reads it back. That works in the pure Java protocol version and when
we use fbclient.dll/fbembed.dll.
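
As a rough illustration of why an explicit length makes embedded 0x00 bytes harmless, here is a Python sketch. The `Field` class is a hypothetical stand-in that borrows only the names sqllen and sqldata; it is not the real XSQLVAR structure and not the JDBC test case mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for a value passed as a length plus a data buffer."""
    sqllen: int     # length of the payload in bytes
    sqldata: bytes  # buffer holding the payload


def pack(payload: bytes) -> Field:
    """Store the payload together with its byte length."""
    return Field(sqllen=len(payload), sqldata=payload)


def unpack(field: Field) -> bytes:
    """Read exactly sqllen bytes, without looking for a terminator."""
    return field.sqldata[:field.sqllen]


payload = b"ab\x00cd"  # 0x00 in the middle of the byte array
round_trip = unpack(pack(payload))
```

Because the reader relies on the length and never scans for a terminator, the 0x00 in the middle comes back unchanged.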
Hope this helps.
Roman