Subject | Re: [firebird-support] Bug with character sets |
---|---|
Author | Kjell Rilbe |
Post date | 2009-05-19T19:47:10Z |
Ann W. Harrison wrote:
> A VARCHAR is passed as two bytes of character length, plus
> that many characters. So the actual stored character length
> is passed for a VARCHAR. The buffer is fixed size and
> probably padded with spaces.
> >
> > Why is the CHAR buffer padded, but it seems VARCHAR isn't?
>
> Both are padded. VARCHAR has the character length. CHAR
> does not. Perhaps the easiest solution is to cast all strings
> to VARCHAR if you're using a multi-byte (and especially a
> variable length multi-byte) character set.
That seems like a workaround and not a solution to me.
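To make the quoted VARCHAR layout concrete, here is a minimal Python sketch of reading such a buffer. It rests on assumptions that the thread does not confirm: the two-byte prefix is taken to be an unsigned little-endian integer that counts bytes of payload, and the payload is UTF-8. The function name `read_varchar` is hypothetical and not part of any Firebird API.

```python
import struct


def read_varchar(buffer: bytes, encoding: str = "utf-8") -> str:
    """Decode a length-prefixed string and ignore any trailing padding.

    Assumptions (not confirmed by the thread): the first two bytes are an
    unsigned little-endian length that counts payload bytes.
    """
    (length,) = struct.unpack_from("<H", buffer, 0)
    payload = buffer[2:2 + length]
    return payload.decode(encoding)


# Build a fixed-size buffer: 2-byte length, 6 bytes of data, 6 bytes of padding.
data = "åäö".encode("utf-8")
buf = struct.pack("<H", len(data)) + data + b" " * 6
print(read_varchar(buf))
```

Because the length travels with the data, the reader never has to guess where the value ends and the padding begins, which is exactly what the CHAR case lacks.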
As far as I can see, and imho, the problem is that for char(N) data the
client doesn't receive enough info from the server to be able to pass
intuitively correct data to the client application. This is likely a
source of many bugs in application code, and this is bad.
Isn't it rather peculiar that for a char(N) utf8 column the client app
may receive a buffer containing anything from N to 4N characters and no
info about the value of N?
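The N to 4N range can be illustrated with a short Python sketch. It assumes the buffer for a char(N) utf8 column is 4 * N bytes and that unused bytes are filled with ASCII spaces; `padded_char_buffer` is a hypothetical helper for illustration, not something fbclient provides.

```python
def padded_char_buffer(value: str, n: int) -> bytes:
    """Encode value as UTF-8 and space-pad it to an assumed 4 * n byte buffer."""
    if len(value) > n:
        raise ValueError("value is longer than the declared CHAR length")
    return value.encode("utf-8").ljust(4 * n, b" ")


N = 3
for value in ["abc", "åäö", "𝄞𝄞𝄞"]:
    buf = padded_char_buffer(value, N)
    # Print the value, the buffer size in bytes, and the characters it decodes to.
    print(repr(value), len(buf), len(buf.decode("utf-8")))
# 'abc' 12 12
# 'åäö' 12 9
# '𝄞𝄞𝄞' 12 3
```

The byte count is always 12, but the number of characters in the buffer varies from 3 to 12, so the buffer alone does not reveal that N is 3.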
Now Dimitry Sibiryakov claims that fbclient doesn't receive enough info
from the server to be able to trim to the right number of codepoints (or
bytes for that matter), while Milan Babuskov actually does just that in
FlameRobin.
So:
1. What does Milan's code know that fbclient doesn't?
2. Where does Milan's code get that info from?
3. Why doesn't fbclient have that info?
4. Can FB be changed so fbclient can get that info in the future, and
use it to trim the buffer to the right size before passing it to the
client application?
The last one, item 4, would constitute "a real solution" imo.
N.B. I'm not convinced that fbclient should do the actual trimming,
considering it requires a lot of knowledge about the various encodings -
knowledge that I assume would imply an undesirable size increase in
fbclient. But maybe the struct returned from fbclient to application
code should be more explicit about the difference between returned
buffer size and actual number of codepoints contained. That should 1)
reduce the number of client application bugs and 2) make it easier for
client application code to trim the buffer content correctly.
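As a rough sketch of what a more explicit struct could enable, the Python below pairs the raw buffer with the declared length in codepoints. The `CharField` type and its fields are hypothetical and do not correspond to any existing fbclient structure; the sketch also assumes the application knows the encoding.

```python
from dataclasses import dataclass


@dataclass
class CharField:
    """Hypothetical descriptor; not part of any Firebird API."""
    buffer: bytes         # raw bytes as delivered, including space padding
    declared_length: int  # N from char(N), counted in codepoints


def trim_to_declared_length(field: CharField, encoding: str = "utf-8") -> str:
    """Decode the buffer and keep only the first N codepoints."""
    text = field.buffer.decode(encoding)
    # Python str slicing counts codepoints, so this drops the excess padding.
    return text[:field.declared_length]


field = CharField(buffer="åäö".encode("utf-8").ljust(12, b" "), declared_length=3)
print(repr(trim_to_declared_length(field)))
```

With N available, trimming becomes a decode and a slice in the application, so the encoding knowledge stays out of fbclient. A value shorter than N keeps its padding up to N codepoints, which matches the usual CHAR semantics.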
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kjell@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64