Subject | Re: [firebird-support] Firebird 2.0 Indexing |
---|---|
Author | Ann W. Harrison |
Post date | 2005-06-06T17:40:42Z |
Svend Meyland Nicolaisen wrote:>>
indexes and the amount of maintenance they require. Here's why.
Size matters a lot in index performance - there's less information per
page with larger entries and the index depth increases.
An index entry currently consists of eight bits of prefix, eight bits of
length, the compressed key value, and a 32 bit value which is a page
number (for upper levels of the index) or a record number (for the
bottom level). Record numbers are 40 bits in V2 but the compression is
better, so the net is smaller. Transaction ids are 32 bit values and
you need two of them - one to say what transaction created the value and
one to say which transaction superseded the value. Adding two
transaction ids to each index key nearly triples the "overhead" size of
an index entry.
But that's no all. If you have an indexed field that switches between
two values, it is currently represented in the index by one entry for
each value - not one entry for each version. For example, transaction 1
creates the record and stored 'ABC' in the indexed field. Transaction 2
updates the record, changing 'ABC' to 'DEF'. The index now has two
entries for the same record, 'ABC' and 'DEF'. Transaction 3 updates the
record again, setting the field's value back to 'ABC'. Since there's
already an index entry for 'ABC' for this record, nothing gets added to
the index.
If we kept transaction information in the index, we would need two 'ABC'
entries for that record, one for the 'ABC' created by Transaction 1 and
superseded by transaction 2, and a second for the one created by
Transaction 3 and not yet replaced. So instead of two entries with 6
bytes of overhead, we now have three entries with 14 bytes of overhead each.
Additional index entries mean more maintenance.
When transaction 2 updated the record, it would have to modify the old
index entry in addition to inserting its new index entry.
When the record version created by transaction 1 is garbage collected,
there's no need to change the index, because there is still a valid
'ABC' in the record version chain. If the index entry were tagged with
the ids of the transactions that created and obsoleted it, the garbage
collection would have to include removing the older of the two 'ABC'
entries.
Regards,
Ann
>Yes, it would be possible, but would greatly increase the size of
>
> Wouldn't it be possible to add transaction information to the index entries
> so that the engine doesn't have to lookup the record for validation?
>
indexes and the amount of maintenance they require. Here's why.
Size matters a lot in index performance - there's less information per
page with larger entries and the index depth increases.
An index entry currently consists of eight bits of prefix, eight bits of
length, the compressed key value, and a 32 bit value which is a page
number (for upper levels of the index) or a record number (for the
bottom level). Record numbers are 40 bits in V2 but the compression is
better, so the net is smaller. Transaction ids are 32 bit values and
you need two of them - one to say what transaction created the value and
one to say which transaction superseded the value. Adding two
transaction ids to each index key nearly triples the "overhead" size of
an index entry.
But that's no all. If you have an indexed field that switches between
two values, it is currently represented in the index by one entry for
each value - not one entry for each version. For example, transaction 1
creates the record and stored 'ABC' in the indexed field. Transaction 2
updates the record, changing 'ABC' to 'DEF'. The index now has two
entries for the same record, 'ABC' and 'DEF'. Transaction 3 updates the
record again, setting the field's value back to 'ABC'. Since there's
already an index entry for 'ABC' for this record, nothing gets added to
the index.
If we kept transaction information in the index, we would need two 'ABC'
entries for that record, one for the 'ABC' created by Transaction 1 and
superseded by transaction 2, and a second for the one created by
Transaction 3 and not yet replaced. So instead of two entries with 6
bytes of overhead, we now have three entries with 14 bytes of overhead each.
Additional index entries mean more maintenance.
When transaction 2 updated the record, it would have to modify the old
index entry in addition to inserting its new index entry.
When the record version created by transaction 1 is garbage collected,
there's no need to change the index, because there is still a valid
'ABC' in the record version chain. If the index entry were tagged with
the ids of the transactions that created and obsoleted it, the garbage
collection would have to include removing the older of the two 'ABC'
entries.
Regards,
Ann