Subject | Re: [firebird-support] Index question |
---|---|
Author | Ann W. Harrison |
Post date | 2010-10-26T22:18:28Z |
On 10/26/2010 5:42 PM, Paul R. Gardner wrote:
If the field you want to index is the primary key, just accept the
ascending index. It will be fine.
either you're looking just at the roots, or you're looking at trunk
and branches, upside down. It's a tree made up of database pages.
You start looking at the top of the tree, which is a single page
that has lots of entries each of which combines a compressed version
of the key value, a record number, and a page number. The page number
is the next index page down the tree that starts with the key value.
Each index page includes a index to itself, so even looking for the
last entry on the page, you never have to read more than 1/4 of the
page.
So, if you have values between 1 and 65536, the top of the tree might
have a 1, a 30000, and a 60000. You're looking for 45000. So you
go down to the page pointed at by 30000 and it contains 30000, 31000,
32005, 33009, etc. You pick up the value 44123 - the next value is
45189 which is too high. You go to that page, use the index on page
to skip to 44950, read fifty entries, find 45000 and the record
number and there you are. Three page reads plus whatever it takes
to find the record by number. The top of the index is almost always
in cache, and the second level is likely to be there too.
an entry to an index requires writing two pages at each index level
plus one for the new top. Consider the three level index above.
If the page for the value you're adding is full, it gets split into
two pages, meaning there has to be a new entry in the next page up
the tree for the new page. If that page is full, it splits, and
a new entry goes in the top page. In the example, the top page
had only three entries, but imagine that it's full too. So it splits,
becoming two pages and Firebird creates a new top level that contains
two entries - one for each of the two pages that hold the entries from
the former top page.
As an aside, normally when an index page splits, half the entries go
to one page and half to the other, so each has room for more entries.
If the entry that causes the page to split would be the last one on the
page, it goes on a page all by itself on the very plausible assumption
that the data is being stored in key order and empty space on earlier
pages isn't useful.
Good luck,
Ann
> Say I have an INVOICE table that has an ID field as a primary key. As IThe primary key declaration automatically generates an ascending index.
> add invoices, a generator gets the next value so the field runs 1 - n.
> Should the index be ascending or descending?
If the field you want to index is the primary key, just accept the
ascending index. It will be fine.
>An index is not a list, it's a tree - and like most computer trees,
> I would think descending is the way to go since users would not lookup
> old invoices (say over 120 days). Therefore a descending index should
> find the invoice I'm trying to lookup faster. After years of data the
> old invoices would not be seen often.
either you're looking just at the roots, or you're looking at trunk
and branches, upside down. It's a tree made up of database pages.
You start looking at the top of the tree, which is a single page
that has lots of entries each of which combines a compressed version
of the key value, a record number, and a page number. The page number
is the next index page down the tree that starts with the key value.
Each index page includes a index to itself, so even looking for the
last entry on the page, you never have to read more than 1/4 of the
page.
So, if you have values between 1 and 65536, the top of the tree might
have a 1, a 30000, and a 60000. You're looking for 45000. So you
go down to the page pointed at by 30000 and it contains 30000, 31000,
32005, 33009, etc. You pick up the value 44123 - the next value is
45189 which is too high. You go to that page, use the index on page
to skip to 44950, read fifty entries, find 45000 and the record
number and there you are. Three page reads plus whatever it takes
to find the record by number. The top of the index is almost always
in cache, and the second level is likely to be there too.
>Yes, I think you are picturing this wrong. The worst case of adding
> On the other hand, since the values go upwards, does the entire index
> need to be rewritten if it's descending? (I'm picturing an index that
> looks something like 5,4,3,2,1. I add #6 to it. Now it needs to be
> 6,5,4,3,2,1 which means that it may have had to shift everything and
> redo the index. Perhaps I'm picturing this wrong!)
an entry to an index requires writing two pages at each index level
plus one for the new top. Consider the three level index above.
If the page for the value you're adding is full, it gets split into
two pages, meaning there has to be a new entry in the next page up
the tree for the new page. If that page is full, it splits, and
a new entry goes in the top page. In the example, the top page
had only three entries, but imagine that it's full too. So it splits,
becoming two pages and Firebird creates a new top level that contains
two entries - one for each of the two pages that hold the entries from
the former top page.
As an aside, normally when an index page splits, half the entries go
to one page and half to the other, so each has room for more entries.
If the entry that causes the page to split would be the last one on the
page, it goes on a page all by itself on the very plausible assumption
that the data is being stored in key order and empty space on earlier
pages isn't useful.
Good luck,
Ann