Subject Re[2]: [Firebird-Architect] Re: Index structures
Author Daniel Rail
Hello Ann,

Monday, June 9, 2003, 1:54:38 PM, you wrote:

AWH> What Arno and Nicholay were proposing was the ability to
AWH> do a binary search of an index bucket which requires (to
AWH> my mind) that all nodes be the same length. This is not
AWH> a MPEG type problem where decompression is done in real
AWH> time and is costly. Comparing compressed keys is cheaper
AWH> than comparing non-compressed keys. A binary search is
AWH> generally faster than a scan.

AWH> We ought to consider the cost of a second type of index
AWH> before jumping into this.

Ann, thanks for the explanation on how the indices work.

I've been doing some thinking. And, this is just theoretical.

What about having an Index Allocation Table (IAT) for each index. In
the IAT, each node is represented by it's ID, how to skip to the next
node and what is the first entry in the node. So the algorithm
basically stays the same, except it is doing the preliminary search in
the IAT, instead of loading all of the pages/nodes. Once it found in
which node it has to perform a more detailed search, that node is
fetched from the hard-disk, unless it's already cached.

Here's an example of what I'm suggesting:
Let's say that the index contains 100 nodes and the first node is the
IAT(there can be more than one IAT depending on the amount of bytes
required). Let's say that the index entry we are looking for is in
the 50th node. By going through the IAT, it is determined that the
entry is in the 50th node. The engine fetches that node, first it
looks in the cache to see if it's there and if it is use it, and if
it's not then fetch it from disk. Once the node is fetched, perform
the detailed search within the node to find the index entry. In total
only 2 nodes/pages were fetched, the IAT and the node where the entry
is located, compared to 50 nodes with the present algorithm.

I know that there is more to this. Especially if there's as been an
insert or update, that could affect the entries in the nodes. Still
referring to the example above: if the node that is fetched doesn't
contain the index entry, because is was bumped to another node due to
an insert or an update, then fetch the IAT again and start the search
again. Even after performing the algorithm a second time, you're
still only fetching a total of 4 nodes/pages compared to approx. 50.

It's just my 2 cents on the subject.

--
Best regards,
Daniel Rail