Subject Re: [IB-Architect] index key question and garbage collecting
Author Jim Starkey
At 06:52 PM 2/15/01 +0300, Dmitry Kuzmenko wrote:
>
>About a week ago I've found (for myself) why garbage collecting index keys
>takes so much time. Looking into ODS.H:
>
>
>In this case, if somebody deleted many records from table with non-unique
index,
>and there are many duplicate keys in this index, IB needs to
>(at least when select count(*) is executed)
>

>1. find record version that must be garbage collected
>2. pick up ALL keys from index, that
>3. delete key version that corresponds to the garbage collected record
>4. go to the point 1 if there is a next record needs to be garbage collected
>

Index garbage collection works this way:

1. For each record to be garbage collected a linked list
representing all versions is computed.

2. For each index defined on that table loop throught
the record versions to be deleted. If any version
has a value for the index not represented in the
list of records staying, remove those index entries
from the index.

3. Do about the same thing for blob pointers.

3. Zap the back pointer of the oldest surviving record.

4. Clean up the now orphan back versions on disk.

(I don't actually remember whether the data pages or index/blob
garbage collection happens first).

>Why I'm talking about it: other servers (Oracle and MS) uses index scan to
>calculate record count. It is easy way even for
>select count(*) from sometable, because there are less pages in index than
>in table, and scanning index pages will be much faster than scanning data
pages.
>But, you see, index key does not have transaction id, and IB can't understand
>what key versions can be seen by some transaction.
>Slow garbage collection is another point of the same issue, I think.
>

The multi-generational nature of Firebird requires that the record
be visited to determine its state relative to your transaction, so
nothing is gained by walking the index rather than the pointer pages.
(Actually a minor de-optimization.)

>The long story, the short question: what about including transaction id into
>BTN structure? Is it possible? Right now I see only the following cases to
>speed up with transaction id in a key:
>1. select count
>2. index garbage collection
>3. group by (when index is used).
>
>What do you think?
>

I don't think it buys you anything at all.

Jim Starkey