Subject Re: [Firebird-devel] Re: [IB-Architect] Working of indices
Author Ann W. Harrison
At 12:06 AM 9/6/2002 +0200, Arno Brinkman wrote:

>But how does the engine know when i do
>"... WHERE FirstName = 'a' " that he must convert
>(using the collate from FirstName) the value 'a' to the
>right representation for index-match ?

I believe that you have to cast the 'a' to the right
character set and collation. It may be that the
engine notices that FirstName has a specific collation
and character set, so it can automatically interpret
'a' correctly.

> > Dates are also stored as double precision
> > numbers and manipulated the same way.
>Do you understand "TimeStamp, Date and Time" under "Dates" ?

Yes. I believe that all those are indexed as floating point

>I'll explain it my way to see if i understand it.
>All indices (from a specific relation) are put as index-nodes
>in an Index Root Page (=IRP further). The index-nodes
>inside the IRP contains information such as page-pointer,
>unique, descending, primary, foreign, etc...

Also key fields and active or inactive. I'd tend to call
those records on the index root page. There's only one
index root page for each table. When a connection
references a table, it gets a read lock on the index
root page. To add, activate, deactivate, or drop an
index, a connection must get a write lock on the index
root page. Acquiring the write lock notifies all the
connections with read locks that a change is about to
happen. This keeps someone from deleting an index
that is in use and allows indexes to be added and
dropped on a live database.

>The page-pointer points to a Index B-Tree node Page


>Can there be more levels as 0 or 1 ?

Yes - there can be any number of levels, though
generally performance suffers with levels higher
than 3 or 4.

>So yes, when are higher levels made ?

Time to talk about bucket splits a bit more. When a new
index entry is put into an index page, everything after
that entry slides down. The next entry after the new one
may need to be recompressed. Eventually, the page becomes
full and it's time for a bucket split.

The engine allocates a new page to the index and that page
is formatted as a level 0 index page. Half the records from
the old page are copied to the new page and the new page is
spliced in between the old page and its old right sibling,
so the old page points to the new page and the new page
points to the right sibling. The first key value on the
new page is propagated up to the next level.

What happens if the new value to the next level causes that
page to overflow? The upper level bucket splits. The top
most level always contains exactly one page. If it splits,
the engine creates a new top level node, which becomes the
new level.

>What information does the data-record pointer contain.
>How does the engine know in what data-page/node to look ?

A record identifier consists of three parts.

The first is the sequence number of a pointer page for
the table. There's a table called rdb$pages which includes
all the pointer pages for each table. A pointer page is a
database page that contains nothing but a header and an
array of page numbers. Using the relation id, the
sequence number, and the page type, the engine looks up
the pointer page number and reads that page.

The second part of the record identifier is the offset on
the pointer page that holds the data page number. The engine
reads that value, then reads the corresponding data page.

The third part of the record identifier is the offset of
the index entry for the record on page. A data page consists
of a header, a variable number of index entries, and data.
The data is written from the bottom of the page up; the
index entries are written from the top down. When there's
less than sixteen bytes between the two, the page is full.
An index entry is just the offset and length of the record
on page. So, a record id of 4:67:2 indicates that the path
to the record is through the 68th entry on the fifth pointer
page for the table, then the third index on that page.


We have answers.