Subject | Re: [firebird-support] UTF8 in firebird ? |
---|---|
Author | Mark Rotteveel |
Post date | 2012-01-07T09:07:46Z |
On 6-1-2012 11:07, Vander Clock Stephane wrote:
> of course i was speaking about codepoint ! not (yet) so crazy to
> think i can put all the symbols in earth in 1 bytes :)
> my index work perfectly, my sorting no (and of course) !
> this why i write this paper about utf8 if not i will stay with
> my ISO8859_1 column and everything will be perfect

Unicode codepoints are not bytes, but the 'abstract' numeric identifiers
for the characters in the Unicode set. Their values are in the range 0x00
to 0x10FFFF. Only when you apply an encoding like UTF-8, UTF-16 or
UTF-32 are they translated into actual bytes for storage. As it stands,
you would need exactly 3 bytes (minimum and maximum) to store Unicode
codepoints as is, without encoding.
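
To make the distinction concrete, here is a minimal Python sketch (not Firebird code) that prints a few codepoints next to the number of bytes each one takes under UTF-8, UTF-16 and UTF-32. The sample characters and the helper name `encoded_lengths` are chosen purely for illustration.

```python
# Code points are abstract numbers; bytes only appear once an encoding is chosen.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji


def encoded_lengths(ch):
    """Return the byte length of one character in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le variant: no BOM is added
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))

# The largest code point, 0x10FFFF, needs 21 bits, so 3 bytes would be
# enough to hold any raw code point value without applying an encoding.
MAX_CODE_POINT = 0x10FFFF
raw_bytes_needed = (MAX_CODE_POINT.bit_length() + 7) // 8
print("bytes for a raw code point:", raw_bytes_needed)
```

The same codepoint occupies a different number of bytes depending on the encoding, which is why "how many bytes per character" only has an answer once the column's character set is known.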
>> If you want simple byte storage and to hell with proper
>> unicode character collation then use character set OCTETS.
>>
> OCTETS or iso8859_1 it's the same in fact ... still need to go like
> you say in the hell of proper unicode character collation in
> both case :(

For character set OCTETS no transliteration is applied (bytes sent in
any connection character set are stored as is), and the padding
character is 0x00 instead of 0x20.
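
As a rough illustration of the padding difference, the Python sketch below models how a fixed-width value would be padded in each case. The function `pad_fixed_width` is a hypothetical helper, not a Firebird API, and it assumes a single-byte character set for the non-OCTETS case. It also shows what plain byte-wise ordering does to mixed-case text.

```python
def pad_fixed_width(data: bytes, width: int, octets: bool) -> bytes:
    """Model the padding of a fixed-width CHAR(width) value (hypothetical helper).

    Assumes a single-byte character set when octets is False.
    """
    if len(data) > width:
        raise ValueError("value too long for column")
    pad = b"\x00" if octets else b"\x20"  # 0x00 for OCTETS, space otherwise
    return data + pad * (width - len(data))


text_value = pad_fixed_width(b"abc", 5, octets=False)
octets_value = pad_fixed_width(b"abc", 5, octets=True)
print(text_value)    # padded with spaces (0x20)
print(octets_value)  # padded with NUL bytes (0x00)

# Byte-wise ordering compares raw byte values, so 'Z' (0x5A) sorts
# before 'a' (0x61); no language-aware collation is involved.
words = [b"apple", b"Zebra"]
bytewise = sorted(words)
print(bytewise)
```

So a client that trims trailing spaces will not strip the padding of an OCTETS column, and sorting on such a column follows byte values rather than any collation.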
--
Mark Rotteveel