Subject | Re: [Firebird-Architect] UTF-8 over UTF-16 WAS: Applications of Encoded Data Streams |
---|---|
Author | Jim Starkey |
Post date | 2005-05-03T02:56:22Z |
Svend Meyland Nicolaisen wrote:
>I have lately wondered why UTF-8 generally seems to be preferred over UTF-16?
>I can understand the use of UTF-8 in applications that need to maintain
>backward compatibility with the US-ASCII character set and/or mainly use
>characters from the US-ASCII character set.
>Also if you need to "allocate" space for an X character wide text field in a
>database like Firebird, I would think that you need to allocate space for
>the worst case scenario which is 4 times X for both UTF-8 and UTF-16. So the
>potential compression of UTF-8 doesn't help much here.
>

Do you have any statistical data to show that UTF-16 consumes fewer bytes
than UTF-8?

Part of the motivation for developing the data stream encoding is to
move away from pre-allocation and other assumptions concerning physical
length altogether.
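
For anyone who wants rough numbers, here is a minimal Python sketch that compares the encoded size of a few strings in both encodings. It is not part of the data stream encoding work; the helper name `encoded_sizes` and the sample strings are illustrative assumptions, and real statistics would have to come from the text actually stored.

```python
def encoded_sizes(text: str) -> dict:
    """Return the code point count and the byte length in UTF-8 and UTF-16."""
    return {
        "chars": len(text),                      # Unicode code points
        "utf8": len(text.encode("utf-8")),       # 1 to 4 bytes per code point
        "utf16": len(text.encode("utf-16-le")),  # 2 or 4 bytes per code point, no BOM
    }


# Hypothetical samples covering the main cases.
samples = {
    "ascii": "Firebird",       # 1 byte each in UTF-8, 2 bytes each in UTF-16
    "latin": "Ærø",            # mixes 1- and 2-byte UTF-8 sequences
    "cjk": "日本語",            # 3 bytes each in UTF-8, 2 bytes each in UTF-16
    "astral": "\U0001D11E",    # outside the BMP: 4 bytes in both encodings
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

The worst case per character is 4 bytes in both encodings, which matches the "4 times X" figure quoted above. Which encoding is smaller on average depends entirely on the mix of characters in the data.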