Subject: Indexes, Multi-Generations, and Everything
Author: Jim Starkey
The question has been raised, most recently in Jason's opus, how
indexes are managed in InterBase with regard to the multi-generation
architecture. Ignoring for the time being all aspects of the actual
index structure, here's the scoop.

Indexes in InterBase are always considered noisy. There is no
practical way to guarantee that an index entry exists for a
non-committed record in the face of a database crash, so the
engine doesn't particularly worry about it. Since the last
disk write when committing a transaction is the TIP (transaction
inventory page), it is guaranteed that a committed record will
indeed have an index entry.

Because of the multi-generation architecture, unique indexes
do indeed have duplicate values. However, the engine does
guarantee that no two committed records will have duplicate
values. More about this later.

When a record is inserted, the engine will make an index entry
for each index defined on the table. When a record is modified,
however, the engine will search the entire chain of back versions
looking for a record with a duplicate value. If (and only if)
it doesn't find a duplicate, it adds an index entry for the index.
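The modify-time rule can be sketched in a few lines of Python. This is
not engine code; the version objects and the key-extraction function are
stand-ins for whatever the engine actually carries around.

```python
# Sketch of the modify-time index rule: an entry is added only when no
# version on the record's back-version chain already carries the key.

def index_entries_for_modify(new_version, back_versions, key_of):
    """Return the index keys needing new entries for a modified record.

    new_version   -- the record version being stored
    back_versions -- iterable of older versions on the same chain
    key_of        -- function extracting the indexed value from a version
    """
    existing = {key_of(v) for v in back_versions}
    new_key = key_of(new_version)
    # If any back version already has this value, the index entry is
    # already there; adding another would only create a redundant node.
    return [] if new_key in existing else [new_key]
```

The point of the back-version scan is that the index entry for the old
value was made when the old version was stored, so the entry is shared
by every version on the chain that carries the same value.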

Garbage collection is similar to modify, but a little more complicated.
The engine first fetches all old versions, creating two lists:
records leaving and records staying. Each unique value in the leaving
list not represented in the staying list is removed from the index.
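The leaving/staying rule is a set difference, which a short sketch makes
explicit (again, hypothetical shapes, not engine structures):

```python
# Garbage-collection rule: remove the index entry for every unique value
# among the leaving versions that no staying version still uses.

def keys_to_remove(leaving_versions, staying_versions, key_of):
    leaving = {key_of(v) for v in leaving_versions}
    staying = {key_of(v) for v in staying_versions}
    return leaving - staying
```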

The bottom line is that no index should ever have two identical
entries (same value, same record number). That said, the engine
still checks for identical entries when inserting an entry, just
in case. If it finds an existing identical node, it very quietly
forgoes adding the redundant value. Similarly, when removing
index entries, it doesn't get upset if the entry can't be found.
Life goes on.
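A toy index keyed by (value, record number) pairs shows the shape of
this tolerance; the class and its methods are illustrative only.

```python
class NoisyIndex:
    """Toy index of (value, record_number) nodes. Inserts and removals
    are deliberately idempotent, matching the behavior described above."""

    def __init__(self):
        self.nodes = set()

    def insert(self, value, rec_no):
        # An identical existing node is quietly left alone.
        self.nodes.add((value, rec_no))

    def remove(self, value, rec_no):
        # A missing node is not an error; life goes on.
        self.nodes.discard((value, rec_no))
```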

Index retrievals are generally done by screaming through the root
index pages (they are linked horizontally, left to right) using
record numbers to set bits in a structure called a sparse bitmap.
When it is done with the scan, the bitmap is combined with any
other bitmaps (bitwise AND or OR, as appropriate). The engine
then walks the bitmap in record number order looking for candidates.
When a record number is found, the record version appropriate for
the transaction is fetched. Before a record can be accepted, however,
the full record selection expression boolean is applied. Without
the boolean test, there is absolutely no guarantee that the target
record actually had the value represented in the index.
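A minimal sketch of the bitmap machinery, using a Python integer to
stand in for the sparse bitmap (the real structure is sparse precisely
so that large record numbers don't cost memory):

```python
# Scan index nodes (value, rec_no) for a range retrieval, setting one
# bit per candidate record number.

def scan_to_bitmap(index_nodes, lo, hi):
    bm = 0
    for value, rec_no in index_nodes:
        if lo <= value <= hi:
            bm |= 1 << rec_no
    return bm

def candidates(bitmap):
    """Yield record numbers in ascending order from the bitmap."""
    rec_no = 0
    while bitmap:
        if bitmap & 1:
            yield rec_no
        bitmap >>= 1
        rec_no += 1
```

Two scans combine with `&` (both conditions) or `|` (either condition),
and each surviving candidate must still pass the full boolean against
the fetched record version.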

Enforcing unique indexes is a tad tricky but works like this:

1. The record is stored.
2. For each unique index on the table, an index scan is
performed on the new record's value using the algorithm
described above.
3. If a record with a bona fide duplicate value is detected,
there is trouble in River City. Because we have the
offending record, we also have the offending transaction's
id. The engine tries to get a lock on that transaction id.
As long as that transaction is alive, it holds the lock, so
the engine stalls waiting for the other transaction to
complete (in the superserver, only the thread stalls). When
the engine gets the lock, it refetches the record using normal
transaction rules. If the duplicate value is there and
committed, there is no alternative but to back out the verb
(including our candidate record) and report an error back
to the client.
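The steps above can be sketched as a small control-flow skeleton. Every
callable passed in (scan_duplicates, wait_for_transaction,
refetch_committed, backout_verb) is a hypothetical stand-in for engine
internals, not a real API.

```python
class DuplicateError(Exception):
    """Raised when a committed duplicate defeats the candidate record."""

def check_unique(new_key, scan_duplicates, wait_for_transaction,
                 refetch_committed, backout_verb):
    for dup_rec_no, dup_txn_id in scan_duplicates(new_key):
        # Stall on the offending transaction's id lock; it is released
        # only when that transaction commits or rolls back.
        wait_for_transaction(dup_txn_id)
        # Refetch under normal transaction rules: is the duplicate now
        # a committed fact?
        if refetch_committed(dup_rec_no, new_key):
            backout_verb()  # includes undoing our own candidate record
            raise DuplicateError("unique constraint violated")
```

If the other transaction rolled back, the refetch finds nothing
committed and our record stands; only a committed duplicate forces the
verb to be backed out.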

Jim Starkey