firebird-support - Gradual decrease in select performance

Subject	Gradual decrease in select performance - Indexes
Author	David Hollings
Post date	2017-12-01T13:33:27Z

Hi all,

We use Firebird 2.5.7 CS 64bit on Centos.
We notice that things slow down gradually over time, eventually calling for
a backup and restore cycle. Thereafter performance returns and slowly
worsens over time.

We know that rebuilding an index will dramatically improve query
performance. An example:

There is a table ITEMS that has about 10 000 inserts per day - very few
deletes.
The life cycle of a record can involve 5-10 updates to fields BRANCH
smallint, STATUS char(1) and USER integer.
There is a compound index IT_STATBRUSR INDEX ON ITEMS(STATUS, BRANCH, USER)
that is the only index used in the query (select count(*) from items where
status = 'A' and branch = 0);

The process was to create a backup of the database and restore to live. The
original database was then used for tests (the idea is that the backup would
have performed garbage collection).
1) gstat -t ITEMS -a
2) isql and run query, set plan on and set stats on then run query again and
gather plan and stats. Then exit isql
3) isql and rebuild IT_STATBRUSR index (using the "alter index IT_STATBRUSR
active;" trick). Then exit isql.
4) gstat -t ITEMS -a
5) isql and run query, set plan on and set stats on then run query again and
gather plan and stats. Finally exit isql.

At all times the database was not being accesses by any other attachment.

Database before index rebuild:

Database header page information:
Flags 0
Checksum 12345
Generation 191679656
Page size 8192
ODS version 11.2
Oldest transaction 186094728
Oldest active 186094729
Oldest snapshot 186094729
Next transaction 186094730
Bumped transaction 1
Sequence number 0
Next attachment ID 5584890
Implementation ID 24
Shadow count 0
Page buffers 512
Next header page 0
Database dialect 3
Creation date Apr 23, 2017 16:05:23
Attributes

Variable header data:
Sweep interval: 0
*END*

ITEMS (144)
Primary pointer page: 351, Index root page: 352
Data pages: 243089, data page slots: 243139, average fill: 87%
Fill distribution:
0 - 19% = 920
20 - 39% = 2363
40 - 59% = 14652
60 - 79% = 7624
80 - 99% = 217530

Index IT_STATBRUSR (6)
Depth: 3, leaf buckets: 6073, nodes: 6070960
Average data length: 0.02, total dup: 6067054, max dup: 161688
Fill distribution:
0 - 19% = 1
20 - 39% = 169
40 - 59% = 3446
60 - 79% = 514
80 - 99% = 1943

SQL> select count(*) from items where status = 'A' and branch = 0;

PLAN (ITEMS INDEX (IT_STATBRUSR))

COUNT
============
1118

Current memory = 6856144
Delta memory = -22584
Max memory = 6931776
Elapsed time= 0.46 sec
Cpu = 0.00 sec
Buffers = 512
Reads = 95055
Writes = 376
Fetches = 357969

After rebuilding the IT_STATBRUSR index:

Database header page information:
Flags 0
Checksum 12345
Generation 191679667
Page size 8192
ODS version 11.2
Oldest transaction 186094733
Oldest active 186094734
Oldest snapshot 186094734
Next transaction 186094737
Bumped transaction 1
Sequence number 0
Next attachment ID 5584893
Implementation ID 24
Shadow count 0
Page buffers 512
Next header page 0
Database dialect 3
Creation date Apr 23, 2017 16:05:23
Attributes

Variable header data:
Sweep interval: 0
*END*

ITEMS (144)
Primary pointer page: 351, Index root page: 352
Data pages: 243089, data page slots: 243139, average fill: 87%
Fill distribution:
0 - 19% = 931
20 - 39% = 2483
40 - 59% = 15387
60 - 79% = 8065
80 - 99% = 216223

Index IT_STATBRUSR (6)
Depth: 3, leaf buckets: 3714, nodes: 5574070
Average data length: 0.01, total dup: 5570658, max dup: 161078
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 1
60 - 79% = 0
80 - 99% = 3713

SQL> select count(*) from items where status = 'A' and branch = 0;

PLAN (ITEMS INDEX (IT_STATBRUSR))

COUNT
============
1118

Current memory = 5226320
Delta memory = 0
Max memory = 5270656
Elapsed time= 0.00 sec
Cpu = 0.00 sec
Buffers = 512
Reads = 1168
Writes = 0
Fetches = 2427

So the number of nodes in the index decreased from 6070960 to 5574070 and
the number of leaf buckets from 6073 to 3714.
The query after index rebuild is significantly faster with 1.22% of the page
reads and 0.68% of the fetches.

After two weeks of use the performance decreases to ~0.1 seconds with ~11500
reads and ~57000 fetches. Index statistics look as follows:

Index IT_STATBRUSR (6)
Depth: 3, leaf buckets: 4583, nodes: 5752544
Average data length: 0.01, total dup: 5748694, max dup: 163232
Fill distribution:
0 - 19% = 0
20 - 39% = 18
40 - 59% = 1282
60 - 79% = 302
80 - 99% = 2981

My question is therefore:
Is this expected for such a scenario where records are inserted and updated
quite often?
I can understand that leaf buckets would increase and fill ratios would
worsen as records are added and modified, pages split and merged; but I
cannot understand why page reads would increase 10 fold over two weeks and
nearly 90 fold over a few months.
Have index pages really become that fragmented due to inserts and updates?

I look forward to hearing your thoughts.

Thanks and regards,
David