Subject Re: [firebird-support] Re: Updating index statistics
Author Ann W. Harrison
lacakus wrote:
> Can I achieve "updating statistics" by :
> alter index <indexname> inactive;
> alter index <indexname> active;

Yes, but that is harder to do on-line than just setting the statistics.
Since the index is unavailable while it's being rebuilt, the rebuild
can't be done while there are prepared queries that depend on it. So
rebuilding indexes is (more or less) an off line operation.
>
> In docs says "Reactivating a deactivated index rebuilds and rebalances
> an index."

Yes, that's true, but no, it's probably not nearly as important as you
think it is.

Many of us had introductory data structure courses that showed an index
design where the index developed was variable depth ... some branches
got really long, others were very short. Access time depended on the
branch and "rebalancing" to make all the branches the same length was
critical to performance. In the real world, nobody is that dumb.

In Firebird, an index is made up of pages - the same size as all pages
in the database. It starts as a single page. Index entries (called
nodes) are added to the page in order by key. When the page fills,
two other index pages are added.

The first one is on the same level - it will be the "right sibling"
and will contain higher key values than the existing page. Ordinarily,
about half the contents of the original page are copied to the new page.
However, if the new node would have been the last node on the old page,
only it goes on the new page.

The second new page is at the next higher level. It will initially
contain two nodes - one pointing to the first of the lower level pages
and one pointing to the second. Each of those nodes will also contain
the key value of the first node on the page it points to, plus the
record number associated with that key value.

The index continues to build in that fashion - when a page overfills,
it splits and create a new right sibling. The next level up gets a
new node pointing to the new page.

When gstat reports that an index has three levels, it means that there
is one top page, one level of intermediate pages, and one level of pages
with nodes that point to records.

When nodes are deleted from an index page, Firebird checks that the fill
level on the page is more that .25 of the page. If it is not, and if
there is enough space on the left sibling page, then the contents of the
mostly empty page are moved to the left sibling and the parent pointers
are fixed to eliminate the now empty page. The page is released to the
free space area and reused as necessary.

Index garbage collection doesn't matter much for indexes on data
that is added and deleted in random order, but it's very useful
in the case where keys from new records have higher values than
keys from old records and old records are deleted. Without index
clean up, the left hand side of the index becomes "hollow" and the
index can be deeper than necessary because of the intermediate
level pointers to empty pages.

There are a number of limits on garbage collection. The most important
is that it will never eliminate a whole level. If you have a three
level index and delete all the data, it will continue to be a three
level index with one empty page on each level. So, if you actually
delete all the data from a table, deactivate and reactivate the indexes.
Nobody else is using that table anyway.

Another limit is that the left most pages at each level never go away.
That's an obvious result of collapsing to the left, and also a clean
way to avoid losing a whole level.

Summary:

A Firebird index never goes "out of balance" like an index built top
down with variable length branches.

Firebird collapses strings of mostly empty index pages, so indexes
don't develop "hollow" areas where the upper level points to empty
pages.

A freshly built index will be more dense than an index built in random
order, but not more dense than an index where each new key entered is
larger than all existing entries (e.g. a primary key with a generator).
If entries are made and removed in random order, having a bit of free
space on index pages reduces page splits, which is a good thing.



Regards,


Ann