Subject | Re: [firebird-support] accent insensitive collation - performance |
---|---|
Author | Geoff Worboys |
Post date | 2016-08-25T02:14:06Z |
Geoff Worboys geoff@... [firebird-support] wrote:
> Hi all,
> Firebird v2.5.6 - superserver. Windows 10 x64.
> I was making some changes to a database when I noticed a query
> like this:
> SELECT TEXTBLOB
> FROM TABLEA WHERE TEXTBLOB CONTAINING 'SOME TEXT'
> was suddenly taking about 4 x longer than before. (A table of
> almost a million records, so it was quite distinct and very
> consistent.)
> It turns out the difference was in my collation declaration.
> The new database had this:
> CREATE COLLATION WIN1252_UNICODE
> FOR WIN1252;
> CREATE COLLATION NOCASE
> FOR WIN1252
> FROM WIN1252_UNICODE
> CASE INSENSITIVE
> ACCENT INSENSITIVE;
> Rebuilt the new database without "accent insensitive" and the
> performance matched the old database.
> Is the performance hit with ACCENT INSENSITIVE to be expected
> in this case?
> I have been considering migrating to UTF8 (but keep putting it
> off because I've quite a bit of other work to do before that
> is possible.) Does anyone know if I will get a similar
> performance hit with ACCENT INSENSITIVE if I use UTF8 and the
> predeclared UNICODE_CI_AI collation?
Well, after testing with a UTF8 based build, even UNICODE_CI_AI
is between 3 and 4 times slower than just UNICODE_CI. So it
seems accent insensitive has a definite and significant cost.
Maybe it's obvious to people in the know but it surprises me.
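For anyone wanting to repeat the comparison, here is a rough sketch of one way it could be set up in isql. It is not the exact test that was run. The table and column names are hypothetical, and it assumes the same text is loaded into both columns before timing.

```sql
-- Hypothetical test table: names are illustrative, not from the original database.
-- Both columns hold the same text; only the collation differs.
CREATE TABLE COLLATION_TEST (
  ID         INTEGER NOT NULL PRIMARY KEY,
  TEXT_CI    BLOB SUB_TYPE TEXT CHARACTER SET UTF8 COLLATE UNICODE_CI,
  TEXT_CI_AI BLOB SUB_TYPE TEXT CHARACTER SET UTF8 COLLATE UNICODE_CI_AI
);
COMMIT;

-- (Assumption: identical rows are inserted into both columns at this point.)

-- isql prints elapsed time and fetch statistics after each statement.
SET STATS ON;

-- Case insensitive only.
SELECT COUNT(*) FROM COLLATION_TEST
WHERE TEXT_CI CONTAINING 'SOME TEXT';

-- Case and accent insensitive.
SELECT COUNT(*) FROM COLLATION_TEST
WHERE TEXT_CI_AI CONTAINING 'SOME TEXT';
```

Running each query more than once and comparing the later runs should reduce the influence of the page cache on the timings.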
--
Geoff Worboys
Telesis Computing Pty Ltd