Subject | Re: [firebird-support] Rebuilding primary key index? |
---|---|
Author | Ann W. Harrison |
Post date | 2010-11-09T20:44:31Z |
On 11/9/2010 1:07 PM, Leyne, Sean wrote:
the reasons. Firebird indexes do not get unbalanced at all.
Selectivity has little or nothing to do with index balance.
Primary key indexes - and unique indexes - do not suffer
from bad selectivity values because their selectivity is
known a priori. Other indexes created on an empty table
will have incorrect selectivities after data is loaded and
should at a minimum have the selectivity set after a large
load or other operation so the stored selectivity matches
the actual data.
Now for my contention that Fire indexes don't get unbalanced.
The classic Database 101 b-tree adds layers going downward
so the path to one record from the top of the tree is say
5 layers while the path to another record is 15 layers. That's
why databases don't use simple b-trees.
In the description below, I'm ignoring jump nodes which are
important but not significant to the discussion.
A Firebird index starts as a single page containing pairs of
values and record identifiers. Eventually, someone wants to
add another pair and it doesn't fit on the page. Firebird
allocates a new page, copies half the pairs into the new page*,
then allocates a second page with two entries, each consisting
of a value, a record identifier, and a page number. The page
numbers are the original index page and the new index page.
The values and record numbers are the first pairs from the
lower pages. The the system goes along, happily filling
bottom level pages, splitting them as necessary, and adding
new pages to the top page. Eventually, the top page fills,
splits, and creates another top page with two entries, one
for the first half of the former top page and one for the
second.
If you think about that for a second, you'll realize that
the length from the top to every lowest level entry is exactly
the same.
Oh, but what about deleted records. Won't I have an unbalanced
index if my key is always increasing with time and I delete the
old records. The answer is no, because Firebird does almost
exactly the reverse when pages get below 1/3 full - it combines
them and removes the pointer from the next level up. That
turns out to be very hard and has taken years of Vlad's life
to make work under load as pages are added to and removed from
indexes.
Best regards,
Ann
*If the new entry comes after the last one that fit on
the last page, only the new entry goes on the new page.
>Sean, I agree with your conclusion, but have trouble with
> Why do you need to rebuild it?
>
> Unlike user indexes, primary key indexes have a fixed selectivity which means that they never become "unbalanced" -- so they never need to be re-indexed.
>
the reasons. Firebird indexes do not get unbalanced at all.
Selectivity has little or nothing to do with index balance.
Primary key indexes - and unique indexes - do not suffer
from bad selectivity values because their selectivity is
known a priori. Other indexes created on an empty table
will have incorrect selectivities after data is loaded and
should at a minimum have the selectivity set after a large
load or other operation so the stored selectivity matches
the actual data.
Now for my contention that Fire indexes don't get unbalanced.
The classic Database 101 b-tree adds layers going downward
so the path to one record from the top of the tree is say
5 layers while the path to another record is 15 layers. That's
why databases don't use simple b-trees.
In the description below, I'm ignoring jump nodes which are
important but not significant to the discussion.
A Firebird index starts as a single page containing pairs of
values and record identifiers. Eventually, someone wants to
add another pair and it doesn't fit on the page. Firebird
allocates a new page, copies half the pairs into the new page*,
then allocates a second page with two entries, each consisting
of a value, a record identifier, and a page number. The page
numbers are the original index page and the new index page.
The values and record numbers are the first pairs from the
lower pages. The the system goes along, happily filling
bottom level pages, splitting them as necessary, and adding
new pages to the top page. Eventually, the top page fills,
splits, and creates another top page with two entries, one
for the first half of the former top page and one for the
second.
If you think about that for a second, you'll realize that
the length from the top to every lowest level entry is exactly
the same.
Oh, but what about deleted records. Won't I have an unbalanced
index if my key is always increasing with time and I delete the
old records. The answer is no, because Firebird does almost
exactly the reverse when pages get below 1/3 full - it combines
them and removes the pointer from the next level up. That
turns out to be very hard and has taken years of Vlad's life
to make work under load as pages are added to and removed from
indexes.
Best regards,
Ann
*If the new entry comes after the last one that fit on
the last page, only the new entry goes on the new page.