Subject | Re: [firebird-support] Right-padded char fields? |
---|---|
Author | Olivier Mascia |
Post date | 2008-09-02T06:04:31Z |
Kjell Rilbe wrote:
> Olivier Mascia wrote:
>
>> Another tricky way would be to have the buffer zero right-padded if
>> the stored string uses less bytes than its maximum (even after space
>> right padding up to the declared count of characters).
>
> But that would cause problems if the actual data contains codepoint
> zero. Or is that an illegal character in all supported character sets?
> I'm pretty sure it isn't, but I might be wrong.

Sure, it might cause a problem with codepoint zero; that's why it would
be tricky. It would be much better to return the count of significant
bytes of the returned padded string. Whether the bytes following
that count are initialized to spaces, to zeroes, or left uninitialized
would make no difference, but I'd vote for spaces just for compatibility.

This is absolutely not a request for CHAR to behave like VARCHAR. With
all non-variable-length charsets, the reported count of significant
bytes in the buffer will be the buffer length. With UTF8 (is any other
charset affected?) that count may be lower than the buffer length.
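
To make "count of significant bytes" concrete, here is a minimal sketch in C of how that count could be derived for a UTF8 CHAR(n) buffer. The function name utf8_significant_bytes and its parameters are hypothetical and not part of the Firebird API; the sketch also assumes the buffer holds well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Firebird API.
 * Returns the number of bytes occupied by the first `declared_chars`
 * UTF-8 characters in `buf`, or `buf_len` if the buffer ends first.
 * Assumes `buf` contains well-formed UTF-8. */
size_t utf8_significant_bytes(const unsigned char *buf, size_t buf_len,
                              size_t declared_chars)
{
    size_t pos = 0;   /* byte offset into the buffer */
    size_t chars = 0; /* characters walked so far */

    assert(buf != NULL);

    while (pos < buf_len && chars < declared_chars) {
        /* Step over the lead byte of the current character... */
        pos++;
        /* ...then over its continuation bytes (bit pattern 10xxxxxx). */
        while (pos < buf_len && (buf[pos] & 0xC0) == 0x80)
            pos++;
        chars++;
    }
    return pos;
}
```

For a CHAR(3) column holding one two-byte character plus two padding spaces in a 12-byte buffer, this would report 4 significant bytes; for pure ASCII content it would report 3.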

By the way, I don't think the user of the API has any use for the buffer
length. The buffer has been allocated by the API; if a fetch updates
the byte length to the (space-padded) data length, the consumer of the
call has all the information it might need. So 'sqllen' could very
well contain a byte count as it does today: not the maximum byte count,
but the used byte count. I hardly see how that could break existing
code, unless we first encourage people to use the tricky practice of
dividing the maximum byte count by 4 to get the maximum character
length, which leads nowhere and still requires UTF8-specific primitives
to copy just the string and not the unused bytes at the end of the buffer.
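
As a rough illustration of the consumer side, the sketch below shows what copying a CHAR field could look like if 'sqllen' carried the used byte count after a fetch. The struct char_field is a hypothetical stand-in for the two XSQLVAR members involved (sqllen and sqldata), and copy_char_field is an invented name; the "used byte count" semantics are the proposal discussed here, not the behaviour of the current API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two XSQLVAR members discussed here. */
struct char_field {
    short sqllen;        /* ASSUMPTION: used byte count, as proposed */
    const char *sqldata; /* fetched, space-padded data */
};

/* Copies the field into `dst` as a NUL-terminated string and returns
 * the number of bytes copied. No UTF-8 decoding is needed because the
 * used byte count is taken directly from sqllen. If `dst` is too small
 * the copy is truncated at a byte boundary, which may split a
 * multi-byte character. */
size_t copy_char_field(char *dst, size_t dst_size, const struct char_field *f)
{
    size_t n;

    assert(dst != NULL && dst_size > 0);
    assert(f != NULL && f->sqldata != NULL && f->sqllen >= 0);

    n = (size_t)f->sqllen;
    if (n > dst_size - 1)
        n = dst_size - 1;

    memcpy(dst, f->sqldata, n);
    dst[n] = '\0';
    return n;
}
```

With the maximum byte count in 'sqllen' instead, the same copy would first have to walk the buffer character by character to find where the declared number of characters ends.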

I'm sorry to be so insistent on such an apparently small topic, but I
always have a hard time accepting such bugs as features of the past.
Nothing specifically related to Firebird, nor personal against anybody,
of course.
--
Olivier Mascia
T.I.P. Group S.A.
http://www.tipgroup.com