Subject Re: [firebird-support] UTF8 in firebird ?
Author Mark Rotteveel
On 6-1-2012 10:47, Vander Clock Stephane wrote:
>
>>> yes, at least some options in the database (or in the create statement)
>>> to define the size in bytes of 1 UTF8 char
>>>
>>> For example, by default 1 utf8 char = 4 bytes (like it is now) and i
>>> can customize it to be equal to 1 byte.
>>
>> Then it is no longer UTF-8.
>>
>
> i think you have a misunderstanding, because utf8 is "FROM" 1 to 6 bytes
> (or even more in theory) !
> so why would 3 (or 4) bytes be utf8 and not 1 or 2 ??

As I said: in theory UTF-8 could encode to more than 4 bytes, as it was
designed to do that, but the standards committee decided not to use more
than 4 bytes. So: the encoding scheme that UTF-8 uses *could* use more
bytes, but the UTF-8 standard does *not allow* the use of more than 4 bytes.
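
To illustrate the difference between the scheme and the standard, here is
a small Python sketch (nothing Firebird-specific; the 5-byte sequence and
the helper name is_valid_utf8 are made up for the example). The largest
code point Unicode allows encodes to 4 bytes, and a strict decoder rejects
a sequence that uses the longer form from the original design:

```python
# The largest code point Unicode allows is U+10FFFF; UTF-8 needs 4 bytes for it.
max_cp = chr(0x10FFFF)
max_len = len(max_cp.encode("utf-8"))

# A hand-built 5-byte sequence in the style of the original, wider scheme.
# 0xF8 was a 5-byte lead byte there; it is not a valid byte in UTF-8 today.
five_byte = bytes([0xF8, 0x88, 0x80, 0x80, 0x80])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as standard UTF-8 (illustrative helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(max_len)                   # 4
print(is_valid_utf8(five_byte))  # False
```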

So, as a character can take up to 4 bytes when encoded, Firebird reserves
4 bytes per character.
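
As a rough sketch of that arithmetic in Python (this only illustrates the
worst-case calculation, it is not Firebird's internal code; reserved_bytes
and the sample characters are assumptions for the example):

```python
MAX_BYTES_PER_CHAR = 4  # upper bound set by the UTF-8 standard


def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


def reserved_bytes(n_chars: int) -> int:
    """Worst-case space for a column of n_chars characters (hypothetical helper)."""
    return n_chars * MAX_BYTES_PER_CHAR


# 'A', e-acute, the euro sign and an emoji need 1, 2, 3 and 4 bytes respectively.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
lengths = [utf8_len(c) for c in samples]

print(lengths)             # [1, 2, 3, 4]
print(reserved_bytes(10))  # 40
```

A column declared for 10 characters has to be able to hold 10 characters
of the widest kind, which is why the reservation is 4 bytes each and
cannot safely be set to 1.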

--
Mark Rotteveel