firebird-support - Re: [firebird-support] Index analysis - what does mean nodes, dup and max dup?

Subject	Re: [firebird-support] Index analysis - what does mean nodes, dup and max dup?
Author	Ann W. Harrison
Post date	2008-11-20T18:19:25Z

emb_blaster wrote:

>
> I thinking how can I use index statistics to optimize my DB.
> I want know what does mean nodes total dup and max dup and if/how they
> relation for example, take this table and index below:
>

The next line introduces a table, with the RDB$RELATION_ID

> SAIDAPRODUTOS (161)

The next line describes the physical structures that support the
table - the first "pointer page" which contains a list of data
pages and a pointer to the next pointer page, and the page that
describes indexes for the table.

> Primary pointer page: 265, Index root page: 266

The next line says the tables has 178 data pages, and that
the pointer page uses 178 slots - the second number would be
higher if pages had been allocated and released after records
were deleted. The average data page is 80% full, which is
pretty good.

> Data pages: 178, data page slots: 178, average fill: 80%

The next seven lines are a histogram of fill levels on data
pages.

> Fill distribution:
> 0 - 19% = 5
> 20 - 39% = 3
> 40 - 59% = 1
> 60 - 79% = 79
> 80 - 99% = 90
>

The first index is the system generated index that maintains the
primary key. It is index 0 for this table.

> Index RDB$PRIMARY30 (0)

The index depth is 2, meaning there is one top level page in
the index, and some number of lower level pages. A leaf bucket
is jargon for an index page on the lowest level, the level that
points to records. There are 11 pages at the lowest level.
An index node is an index entry - something that points to a
record. This index has 8865 record entries, plus eleven entries
at the next level up.

> Depth: 2, leaf buckets: 11, nodes: 8865

Firebird uses both prefix and suffix compression on index keys.
As a result, the average key length is about one byte. Not
surprisingly, there are no duplicate values. An active index
- one where entries are added, removed, and then added back -
can have duplicates, even if it is unique. No single transaction
will see more than one instance of each value. The magic of
MVCC.

> Average data length: 1.01, total dup: 0, max dup: 0

Again, the fill distribution is a histogram of the space used on
index pages.

> Fill distribution:
> 0 - 19% = 0
> 20 - 39% = 0
> 40 - 59% = 0
> 60 - 79% = 0
> 80 - 99% = 11

The next line describes the next index on the table

>
> Index SAIDAPRODUTOS_IDX1 (1)

The depth is still 2, but there are fewer pages used to hold
the same number of entries, suggesting better compression.
You would expect that if there are duplicate values because
prefix compression totally removes the second and subsequent
duplicate key values.

> Depth: 2, leaf buckets: 10, nodes: 8865

And indeed, the average data length is less than a third of
a byte. Of the 8865 entries, there are only 599 distinct
values (if I can subtract). One value is represented 39 times,
which is the most instances of any particular value. In early
versions of Falcon, long duplicate chains are a performance
problem.

> Average data length: 0.30, total dup: 6266, max dup: 38
> Fill distribution:
> 0 - 19% = 0
> 20 - 39% = 1
> 40 - 59% = 0
> 60 - 79% = 0
> 80 - 99% = 9

Good luck,

Ann