Subject Re: [Firebird-Architect] Re: UTF-8 vs UTF-16
Author Jim Starkey
Roman Rokytskyy wrote:

>>>I think that Firebird engine could (and should) become completely
>>>UNICODE-16. No zoo of charsets, no multibyte encodings with a sword
>>>of Damocles of buffer overflows.
>
>c) Current limitation of CHAR and VARCHAR columns is ~32,000 bytes.
>Saying that each character needs 2 bytes, you decrease this boundary
>to ~16,000.
>
Or fix the length restriction... Remember that the engine was written
when workstations tended to max out at 3 or 4 megs. Those days are
long gone. Go hog wild and change all the shorts to longs and live
free.
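
A rough illustration of the arithmetic, as a sketch only: it assumes the
column length is held in a signed 16-bit field (which would explain the
~32,000 byte figure) and that every character takes exactly 2 bytes. The
names max_chars and fits_in_field are made up for this example and are
not engine code.

```python
import struct

INT16_MAX = 2**15 - 1   # largest value a signed 16-bit "short" can hold
INT32_MAX = 2**31 - 1   # largest value a signed 32-bit "long" can hold


def max_chars(max_bytes, bytes_per_char):
    """Whole characters that fit in a column limited to max_bytes."""
    return max_bytes // bytes_per_char


def fits_in_field(length, fmt):
    """True if length can be stored in a field of struct format fmt
    ("h" = signed 16-bit, "i" = signed 32-bit)."""
    try:
        struct.pack("<" + fmt, length)
        return True
    except struct.error:
        return False


# With a 16-bit length, 2-byte characters halve the limit.
short_limit = max_chars(INT16_MAX, 2)
# With a 32-bit length the limit stops being a practical concern.
long_limit = max_chars(INT32_MAX, 2)
```

Under those assumptions a 40,000 byte value is rejected by the 16-bit
field and accepted by the 32-bit one, which is the whole point of
widening the length.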

The more serious problem is indexes. The architectural limit of 255
bytes was a serious mistake that I made over and over. Switching to a
variable length prefix and length codes would eliminate the
restriction at virtually no cost (actual keys less than 128 bytes
would remain the same length).
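
One way to read "variable length prefix and length codes" is sketched
below. This is an assumption about the idea, not the engine's on-disk
format: each key is stored as the count of leading bytes it shares with
the previous key, the count of remaining bytes, and then those bytes.
Both counts use a 7-bits-per-byte code whose high bit means "another
byte follows", so any value under 128 still takes one byte. All function
names here are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, low bits first.
    The high bit of each byte is set when more bytes follow."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)   # more bytes follow
        else:
            out.append(low)          # final byte
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode one integer starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def pack_key(prev, key):
    """Prefix-compress key against the previous key in the index."""
    shared = 0
    limit = min(len(prev), len(key))
    while shared < limit and prev[shared] == key[shared]:
        shared += 1
    rest = key[shared:]
    return encode_varint(shared) + encode_varint(len(rest)) + rest


def unpack_key(prev, buf, pos=0):
    """Rebuild a key from its packed form; return (key, next_pos)."""
    shared, pos = decode_varint(buf, pos)
    rest_len, pos = decode_varint(buf, pos)
    return prev[:shared] + buf[pos:pos + rest_len], pos + rest_len
```

In this sketch a short key such as b"abd" following b"abc" packs to two
one-byte counts plus one data byte, exactly what fixed one-byte counts
would give, while a 292 byte key sharing 290 bytes with its predecessor
costs one extra byte for the larger count and is no longer rejected.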

It's probably infeasible to switch completely to Unicode, but a 16-bit
string datatype would go a long way towards fixing the problem.

