Subject Re: [Firebird-Architect] A Fresh Look at Collations
Author Jim Starkey
Adriano dos Santos Fernandes wrote:
> On 21/06/2010 12:08, Dimitry Sibiryakov wrote:
>
>> 21.06.2010 16:46, Sergey Mereutsa wrote:
>> AWH>> Why do you say UTF-8 is slow?
>>
>>
>>> Because you can not count string length, for example, without walking
>>> it all - because each char in UTF8 (if we speak about it native
>>> representation) can be from 1 to 6 bytes length.
>>>
>>>
>> But is counting of symbols so frequent operation to care about its
>> speed?..
>>
>>
> At least when data come from user, it must be validated and characters
> counted (when using constrained length strings - aka [VAR]CHAR).
>
> If you don't do this things, it will be like Interbase and FB 1.5. It is
> then better to call it bytes instead of UTF-8.
>
>

That's a client-side operation. If the server doesn't trust the client,
it can validate incoming UTF-8 strings, but even that is a cheap operation.
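
To illustrate why server-side validation is cheap, here is a minimal
sketch (not taken from any Firebird or NimbusDB source) of a single
linear pass that both validates a UTF-8 byte string and counts its
characters. The function name utf8_char_count is hypothetical, and the
sketch assumes the RFC 3629 definition of UTF-8, which limits sequences
to 4 bytes rather than the 6 allowed by the original encoding.

```python
def utf8_char_count(data: bytes) -> int:
    """Validate UTF-8 and return the number of code points, in one pass.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    count = 0
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # Pick sequence length, payload bits of the lead byte, and the
        # smallest code point that legitimately needs that length.
        if lead < 0x80:
            length, cp, lowest = 1, lead, 0x0
        elif 0xC2 <= lead <= 0xDF:      # 0xC0 and 0xC1 are always overlong
            length, cp, lowest = 2, lead & 0x1F, 0x80
        elif 0xE0 <= lead <= 0xEF:
            length, cp, lowest = 3, lead & 0x0F, 0x800
        elif 0xF0 <= lead <= 0xF4:      # above 0xF4 exceeds U+10FFFF
            length, cp, lowest = 4, lead & 0x07, 0x10000
        else:
            raise ValueError(f"invalid lead byte at offset {i}")

        if i + length > n:
            raise ValueError(f"truncated sequence at offset {i}")

        # Every trailing byte must look like 10xxxxxx.
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation byte after offset {i}")
            cp = (cp << 6) | (b & 0x3F)

        # Reject overlong forms, UTF-16 surrogates, and out-of-range values.
        if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"invalid code point at offset {i}")

        count += 1
        i += length
    return count
```

Each byte is examined once, so the cost is linear in the byte length of
the string, and the character count needed for a [VAR]CHAR(n) check
comes out of the same pass as the validation.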

Something that I should have mentioned in passing, incidentally, is that
all strings in NimbusDB are arbitrary length, so there aren't any
issues of logical versus physical string lengths.

--
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376


