Subject Re: [firebird-support] Re: Using unicode versus WIN1252 (Firebird 2)
Author Milan Babuskov
Fulvio Senore wrote:
> Milan Babuskov wrote:
>>> German characters mostly have ASCII codes below 0x7F. With UTF8 these
>>> characters have the same storage size as in WIN1252, right?
>>>
>> Only when represented in UTF8 form. However, Firebird represents them
>> internally as 4 bytes per character.
>>
> Do you mean that the same ASCII string takes much more space when the
> column is encoded as UTF8 than when it is encoded with a single byte
> encoding like WIN1252?

Yes.

> I upgraded a program to use UTF8 text fields to help with international
> characters and a test database containing almost only ASCII text fields
> almost doubled its size from 12 to more than 20 MB.

This is expected behavior.
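
To illustrate the arithmetic, here is a small Python sketch. It is not Firebird code: the helper names (encoded_size, reserved_size) are made up for this example, and the 4-bytes-per-character figure is simply the one mentioned above. It compares the encoded length of a string with the space needed when every character is given a fixed number of bytes. It does not model Firebird's actual on-disk format, so the numbers are only an illustration and will not match real database sizes.

```python
# Illustrative sketch only; names are hypothetical, not part of any Firebird API.
UTF8_BYTES_PER_CHAR = 4      # per-character width for UTF8, as stated in this thread
WIN1252_BYTES_PER_CHAR = 1   # WIN1252 is a single-byte character set


def encoded_size(text, encoding):
    """Number of bytes the text occupies once encoded."""
    return len(text.encode(encoding))


def reserved_size(text, bytes_per_char):
    """Bytes needed if every character gets a fixed-width slot."""
    return len(text) * bytes_per_char


ascii_text = "Firebird"              # 8 characters, pure ASCII
german_text = "Gr\u00fc\u00dfe"      # "Gruesse" written with u-umlaut and sharp s, 5 characters

for text in (ascii_text, german_text):
    print(
        repr(text),
        "utf-8 encoded:", encoded_size(text, "utf-8"),
        "cp1252 encoded:", encoded_size(text, "cp1252"),
        "4 bytes/char:", reserved_size(text, UTF8_BYTES_PER_CHAR),
        "1 byte/char:", reserved_size(text, WIN1252_BYTES_PER_CHAR),
    )
```

For the ASCII string, the UTF-8 and WIN1252 encodings are both 8 bytes, while the fixed 4-bytes-per-character layout needs 32.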

--
Milan Babuskov
http://www.flamerobin.org
http://www.guacosoft.com