Subject Firebird 3.0.4 unicode_ci_ai index problems
Author Luis Forra


I'm using Firebird 3.0.4 x64 on Linux and Windows, SuperServer.

The databases that I have migrated to UTF8 with the collation UNICODE_CI_AI are much slower in use. The problem seems to be with indexes that span several VARCHAR columns.

Example of the problem:

Data:

CREATE TABLE TEST_UNICODE (
    S1  VARCHAR(10) NOT NULL COLLATE UNICODE,
    S2  VARCHAR(10) NOT NULL COLLATE UNICODE
);


CREATE TABLE TEST_UNICODE_CI_AI (
    S1  VARCHAR(10) NOT NULL COLLATE UNICODE_CI_AI,
    S2  VARCHAR(10) NOT NULL COLLATE UNICODE_CI_AI
);


INSERT INTO TEST_UNICODE (S1, S2) VALUES ('A', 'A');
INSERT INTO TEST_UNICODE (S1, S2) VALUES ('A', 'B');
INSERT INTO TEST_UNICODE (S1, S2) VALUES ('B', 'A');
INSERT INTO TEST_UNICODE (S1, S2) VALUES ('B', 'B');

COMMIT WORK;

INSERT INTO TEST_UNICODE_CI_AI (S1, S2) VALUES ('A', 'A');
INSERT INTO TEST_UNICODE_CI_AI (S1, S2) VALUES ('A', 'B');
INSERT INTO TEST_UNICODE_CI_AI (S1, S2) VALUES ('B', 'A');
INSERT INTO TEST_UNICODE_CI_AI (S1, S2) VALUES ('B', 'B');

COMMIT WORK;

CREATE INDEX TEST_UNICODE ON TEST_UNICODE (S1, S2);
CREATE INDEX TEST_UNICODE_CI_AI ON TEST_UNICODE_CI_AI (S1, S2);

COMMIT WORK;
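
To rule out a difference in the index definitions themselves, here is a small sketch that lists the segments of both indexes from the system tables. It only assumes the index names used above; RDB$INDEX_SEGMENTS is the standard Firebird system table for index columns.

```sql
-- List the columns (segments) of both test indexes, in key order,
-- to confirm that each index really contains S1 followed by S2.
SELECT s.RDB$INDEX_NAME,
       s.RDB$FIELD_NAME,
       s.RDB$FIELD_POSITION
FROM RDB$INDEX_SEGMENTS s
WHERE s.RDB$INDEX_NAME IN ('TEST_UNICODE', 'TEST_UNICODE_CI_AI')
ORDER BY s.RDB$INDEX_NAME, s.RDB$FIELD_POSITION;
```

If both indexes list the same two segments, the only remaining difference between the tables is the collation.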

Query:

SELECT S1,S2 FROM test_unicode WHERE S1 = 'B' AND S2 = 'A'
UNION ALL
SELECT S1,S2 FROM test_unicode_ci_ai WHERE S1 = 'B' AND S2 = 'A'

Plan from isql:

Select Expression
    -> Union
        -> Filter
            -> Table "TEST_UNICODE" Access By ID
                -> Bitmap
                    -> Index "TEST_UNICODE" Range Scan (full match)
        -> Filter
            -> Table "TEST_UNICODE_CI_AI" Access By ID
                -> Bitmap
                    -> Index "TEST_UNICODE_CI_AI" Range Scan (partial match: 1/2)
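
For anyone who wants to reproduce this, a minimal isql sketch follows. It assumes a Firebird 3.0 isql session connected to a UTF8 database that already contains the two test tables; SET EXPLAIN is the isql switch that prints the detailed plan format shown above.

```sql
-- Print the detailed (explained) plan for each statement.
SET EXPLAIN ON;
-- Print performance counters (reads, writes, fetches) after each statement.
SET STATS ON;

SELECT S1, S2 FROM TEST_UNICODE WHERE S1 = 'B' AND S2 = 'A'
UNION ALL
SELECT S1, S2 FROM TEST_UNICODE_CI_AI WHERE S1 = 'B' AND S2 = 'A';
```

Running the two SELECTs separately instead of as a UNION makes it easier to compare the counters per table.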

I get 2 indexed reads with test_unicode_ci_ai and 1 indexed read with test_unicode: the UNICODE_CI_AI index is matched on only the first of its two segments. With millions of records the problem escalates.
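
To show how this grows with data volume, here is a sketch that loads many rows sharing the same S1 value into both tables. The row count of 100000 and the generated S2 values are arbitrary choices for illustration, not figures from my real databases.

```sql
SET TERM ^ ;

-- Insert 100000 rows with S1 = 'B' and a distinct numeric string in S2
-- into both tables, so the first index segment alone is not selective.
EXECUTE BLOCK AS
  DECLARE i INTEGER = 0;
BEGIN
  WHILE (i < 100000) DO
  BEGIN
    INSERT INTO TEST_UNICODE (S1, S2)
      VALUES ('B', CAST(:i AS VARCHAR(10)));
    INSERT INTO TEST_UNICODE_CI_AI (S1, S2)
      VALUES ('B', CAST(:i AS VARCHAR(10)));
    i = i + 1;
  END
END^

SET TERM ; ^

COMMIT WORK;

-- Refresh the stored selectivity of both indexes after the bulk load.
SET STATISTICS INDEX TEST_UNICODE;
SET STATISTICS INDEX TEST_UNICODE_CI_AI;

COMMIT WORK;

-- Compare the counters for the same lookup on each table.
SET STATS ON;
SELECT S1, S2 FROM TEST_UNICODE WHERE S1 = 'B' AND S2 = 'A';
SELECT S1, S2 FROM TEST_UNICODE_CI_AI WHERE S1 = 'B' AND S2 = 'A';
```

If the partial match works the way the plan suggests, I would expect the second lookup to scan every index entry with S1 = 'B' and filter on S2 afterwards, while the first goes straight to the matching key.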

What am I doing wrong? Thank you.