Subject Index architecture
Author Sean Winstead
Hi,

I am interested in learning how InterBase handles indexes. Up front, I must say that I am ignorant about InterBase and how it does things so bear with me.

>> For the record, an Interbase index node consists of the following:

>> 1 byte of prefix length
>> 1 byte of key length (excluding prefix)
>> 4 bytes of record number
>> variable length key

Is InterBase limited to 2^32-1 records in a table (i.e., 4 bytes for record number)?

What does the prefix length mean? The following snippet gives me a clue but a straightforward definition would help me.

>> Second, the prefix compression scheme allows most nodes to be
>> skipped based on prefix length. For example, if the last
>> key comparison was unequal in the second byte, all node with
>> a prefix of two or greater can be skipped without further
>> ado. This translates into an average number of memory references
>> per node somewhere between two and three (one for the prefix,
>> one possibly for a key byte, and one for the key length to
>> find the next node). I'm quite sure it can't be made faster.


>> First, as you may already know, the keys are pre-mulched so that
>> keys can be compared as unsigned byte streams without regard to
>> type, length, or single/multiple field (there's an interesting
>> trick if anybody's interested). The idea was to remove all
>> complexity from the page scan code to make the inner loop as
>> fast as possible.

I am interested in learning more about this.


Sean