firebird-support - Re: [firebird-support] Very slow query if collation is UNICODE_CI

Subject	Re: [firebird-support] Very slow query if collation is UNICODE_CI_AI
Author	Thomas Steinmaurer
Post date	2012-11-10T10:05:42Z

> I have a problem with some tables, which seem to have to do with the collation...
>
> 1. case:
> ========
> Database with UTF8 as character set has no default collation, collumns have UNICODE as collation:
>
> CREATE TABLE POSTALCODES
> (
> ID INTEGER NOT NULL COLLATE UNICODE,
> COUNTRY VARCHAR( 3) NOT NULL COLLATE UNICODE,
> ZIPCODE VARCHAR( 30) NOT NULL COLLATE UNICODE,
> CITY VARCHAR( 50) NOT NULL COLLATE UNICODE,
> REGION SMALLINT
> );
>
> ALTER TABLE POSTALCODES ADD CONSTRAINT PK_POSTALCODES PRIMARY KEY(ID);
> CREATE GENERATOR POSTALCODES_PRIMARYKEY;
>
> CREATE ASC INDEX SYS_POSTALCODES_ZIPCODE ON POSTALCODES (ZIPCODE);
> CREATE ASC INDEX SYS_POSTALCODES_CITY ON POSTALCODES (CITY);
>
>
>
> 2. case:
> Database with UTF8 as character set has UNICODE_CI_AI as default collation, collumns have no specific collation:
> CREATE TABLE POSTALCODES
> (
> ID INTEGER NOT NULL,
> COUNTRY VARCHAR( 3) NOT NULL,
> ZIPCODE VARCHAR( 30) NOT NULL,
> CITY VARCHAR( 50) NOT NULL,
> REGION SMALLINT
> );
>
> ALTER TABLE POSTALCODES ADD CONSTRAINT PK_POSTALCODES PRIMARY KEY(ID);
> CREATE GENERATOR POSTALCODES_PRIMARYKEY;
>
> CREATE ASC INDEX SYS_POSTALCODES_ZIPCODE ON POSTALCODES (ZIPCODE);
> CREATE ASC INDEX SYS_POSTALCODES_CITY ON POSTALCODES (CITY);
>
>
> The following query (search string can contain letters too theoretically)
>
> select ID, COUNTRY, ZIPCODE, CITY, REGION
> from POSTALCODES
> where (ID > 0) and (upper(ZIPCODE) like '12345%')
>
> is being executed immediately in the first case, but takes almost 2 minutes in the second case...
>
> If I remove the UPPER part, it is being executed immediately.
> If I specify UNICODE as a differing collation for the ZIPCODE column, it is being executed immediately as well.
>
>
> What is wrong with the second case and how can I fix it?

I'm confused ...

If you have a character field with a case-senstive collation, then
querying for UPPER without an function-based index shouldn't use an index.

If you have a character field with a case-insenstive collation, then my
understanding is that you don't have to use UPPER at all, to perform a
case-insensitive query, thus a regular, non-function-based index on the
field with a case-insensitve collation should be sufficient.

--
With regards,
Thomas Steinmaurer
http://www.upscene.com/