Subject: Re: [firebird-support] Re: High CPU consumption
Author: Ann W. Harrison
kapsee wrote:
> Thanks for the detailed response. When will version 2 be available ?
> I am on 1.5 currently.

V2.0 will be released when it's finally judged ready. There is an alpha
release available. In the meantime, you can identify and fix the indexes
that are causing problems yourself. Run gstat with the -r and -a
switches, directing the output to a file.
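For example - the database path and the SYSDBA credentials here are
placeholders, so substitute your own:

   gstat -a -r -user SYSDBA -password masterkey /data/mydb.fdb > stats.txt

The -a switch analyzes data and index pages; -r adds the record and
back-version statistics discussed below.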

The results will include some header information, then a description of
each table. Most of the information is specialized and might be of
interest under circumstances I can't really imagine. However, here is
the interesting stuff.

Here's the first part of a table entry. It's followed by a fill
distribution, which is of great interest to somebody, I suppose, but of
no practical value to you.

CONTACT (131)
Primary pointer page: 553, Index root page: 554
Average record length: 50.91, total records: 3495877
Average version length: 9.00, total versions: 10307, max versions: 2345


The first line is the table name (CONTACT) and the internal table id.
Nothing much you can do with that. The second line is also internal
information.

The third line says that the average length of a compressed record in
the table is 50.91 bytes - which may be interesting in general but isn't
pertinent for this discussion.

The fourth line does have some interesting information. The first part
is the length of a back version - not interesting. The second part
(total versions) is the number of back versions of records. Since it's
over 10 thousand, it suggests that garbage collection isn't going well
for this table. The final entry on the line - which probably wraps for
you - is called max versions and it is the maximum number of back
versions for any single record in the table. In this (fictitious) case
there is one record that has been modified two thousand three hundred
and forty-five times since it was last garbage collected. That suggests
that the application is doing something odd ... perhaps deliberately
using the record as a gateway to protect something else ... but it's
distinctly worth looking at.

Following the table entry are a series of entries describing all the
indexes defined for the table. Here's an entry for a very suspect
index. Again, I've left off the fill distribution as not being very
interesting.

Index CONTACT_IDX1 (5)
Depth: 3, leaf buckets: 5430, nodes: 3505877
Average data length: 0.00, total dup: 2635903, max dup: 2005297

The first line is the name of the index and the internal index id - the
scope of the id is the table...

The first item of the first line of detail is important. The depth of
an index should be 3 or less. To decrease the depth of indexes, you
must back up the database and restore it with a larger page size. The
rest of the line describes the width of the index - not interesting -
and the number of "nodes" - a node is an index entry. You'll often find
that the number of nodes is higher than the number of records in the
table - here, 3505877 nodes against 3495877 records. That happens when
there are back versions with separate index entries.
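The backup/restore cycle is done with gbak. Again, the paths,
credentials, and the 8K page size below are just examples:

   gbak -b -user SYSDBA -password masterkey /data/mydb.fdb /data/mydb.fbk
   gbak -c -p 8192 -user SYSDBA -password masterkey /data/mydb.fbk /data/mydb_new.fdb

The -b switch backs the database up, -c creates a new database from the
backup, and -p sets the page size of the restored database.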

The second line is also interesting. The average data length is the
length of the data portion of the index key after compression. Zero
looks like a good number but it usually means that there are lots of
duplicates in the index. The "total dup" is the total number of
duplicated entries in the index. That number can be quite large without
affecting performance, as long as the duplicates are distributed across
all the entries. The "max dup" number is the real red flag. That's the
largest number of instances of a single value in the index. In this
case, it's two million five thousand two hundred and ninety-seven -
well over half of all the entries in the index.

When you find an index with a "max dup" number greater than about 30
thousand, you've found a candidate for replacement. If records are
deleted from the table or modified to a different key value, that index
will cause slow, CPU intensive garbage collection. But you can fix it.
First, find the index definition and the table definition. Then drop
the index. Finally, recreate the index with its existing keys first,
followed by the primary key field(s) from the table. The index will
continue to work (nearly) as well as it did before, and will no longer
cause problems for the garbage collector.
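In SQL, the whole operation looks something like this. CONTACT_IDX1 is
the fictitious index above, and the column names STATUS and CONTACT_ID
are invented for illustration - use the columns your own index and
primary key actually contain:

   /* Look up the columns in the suspect index ... */
   SELECT RDB$FIELD_NAME, RDB$FIELD_POSITION
     FROM RDB$INDEX_SEGMENTS
    WHERE RDB$INDEX_NAME = 'CONTACT_IDX1'
    ORDER BY RDB$FIELD_POSITION;

   /* ... then drop it and recreate it with the primary key
      column(s) appended, so each key value is (nearly) unique. */
   DROP INDEX CONTACT_IDX1;
   CREATE INDEX CONTACT_IDX1 ON CONTACT (STATUS, CONTACT_ID);

With the primary key appended, almost every key is unique, so the
garbage collector no longer has to walk a long chain of duplicates to
find the entry it needs to remove.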

Or, you can wait for Version 2.


Regards,


Ann