Subject: Re: [Firebird-Architect] Re: Alternate way to use indexes with poor selectivity
Author: Ann Harrison
Post date: 2012-01-03T17:37:03Z
Hi Vlad,

> I.e. you offer to replace one index scan by many index lookups?

Not exactly. I'm trying to reduce the number of records that need to
be read when there's decent selectivity in one index and horrible
selectivity in the others. Currently, the alternative is to read
all the indexes, create bitmaps of the matching record numbers, and
combine the bitmaps. That's bad if some of the indexes have a
selectivity of 1:5 or 1:10, because it builds huge bitmaps. On the
other hand, if you've got a plausible set of record numbers from the
first index, reducing the number of records read by a factor of 5 or
10 by probing the other indexes should be a performance gain.
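
To make the difference concrete, here is a rough Python sketch of the two
strategies. It is only a toy model, not engine code: an index is pretended to
be a dict mapping each key value to its record numbers, and scan_bitmap,
probe and retrieve are invented names.

# Toy model: an index is a dict of key value -> record numbers (ascending),
# and a predicate is any function over key values.

def scan_bitmap(index, predicate):
    """Current approach for one index: collect every matching record number."""
    return {recno
            for value, recnos in index.items() if predicate(value)
            for recno in recnos}

def probe(index, predicate, recno):
    """Proposed dip: is there an entry (value, recno) for any matching value?"""
    return any(recno in recnos
               for value, recnos in index.items() if predicate(value))

def retrieve(selective_index, selective_pred, other_indexes):
    """Bitmap from the selective index only; probe the poor ones per record."""
    candidates = scan_bitmap(selective_index, selective_pred)
    return [recno for recno in sorted(candidates)
            if all(probe(index, pred, recno) for index, pred in other_indexes)]

With a 1:5 index, scan_bitmap would return roughly a fifth of the table,
which is exactly the bitmap we would rather not build; probe only has to
visit the entries for a single record number.
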
> It could be good (if we have small primary bitmap and a lot of entries
> in secondary index satisfied search criteria) or bad (if we have opposite
> case). I'd said we can use min and max record numbers from primary bitmap
> to limit scanned number of secondary index entries, but it could be not so
> effective...

The problem I'm trying to solve is having decent selectivity in one index and
horrible selectivity in the others. I'm assuming that Firebird can tell the
difference between a 1:10000 index and a 1:5.

>> That's a unique key. If it's found, keep that
>> record number in the primary bitmap, if not, drop it.
>
> Sorry, we shouldn't clear record numbers in primary bitmap as there could
> be another index entry at the secondary index with the same recno but
> different key (remember record versions ?).

Yes, I do remember record versions, but was not completely clear in my
statement of the case and hadn't thought through the problem of ranges
on the secondary indexes. When there's an equality match in the
secondary indexes, the record number is either there or not. For
ranges in secondary indexes it's a bit harder.
For those who've spent less time than Vlad thinking about the format
of Firebird "b-trees" and the ODS:
Indexes identify a record by record number. All versions of a record
have the same record number. If different versions of a record have
different values for an index key, the record number will appear in
the index once for each unique value of the key. If you have an
invariant key, there's one index entry for all versions of the record.
If you have a key that changes, there's one entry for each value.
All Firebird indexes are effectively unique if you consider the key a
combination of the indexed value and the record number. All
duplicates on the key value are stored in order by record
number, so the combination of record number and value is easy to look
up. That's necessary to limit the cost of index garbage collection.
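
A throwaway illustration of those two points (the numbers are invented, and
these structures are of course not the ODS):

# Record 2376347 has three versions whose prices are 150000, 150000 and 175000.
# The price index keeps one entry per distinct (key value, record number) pair,
# not one entry per version.
version_prices = [150000, 150000, 175000]
entries = sorted({(price, 2376347) for price in version_prices})
print(entries)        # [(150000, 2376347), (175000, 2376347)]

# Duplicates of a key value are ordered by record number, so finding the single
# entry (175000, 2376347) never means scanning the whole duplicate chain --
# which is what keeps index garbage collection affordable.
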
So a query like this:
select first 200 skip 0 property_picture from properties
where city = 'NYC' and
bedrooms = 3 and
price between 100000 and 200000
order by price ascending
could be resolved by walking the price index starting at 100000, then
looking up the values of city and bedrooms in their indexes by value
and record number. If the record number shows up in the city index under
'NYC' and in the bedrooms index under 3, the record is worth reading. The
actual record may still not match, because the indexed value may belong to
a version other than the current one (though few properties move out of
New York or change their number of bedrooms).
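
In the toy model from above, that plan is a short loop; walk_range and the
index dicts are still invented stand-ins for the real b-tree code:

def walk_range(index, low, high):
    """Walk the driving index in key order, yielding record numbers in range."""
    for value in sorted(index):
        if low <= value <= high:
            for recno in index[value]:
                yield recno

def candidate_records(price_idx, city_idx, bedrooms_idx):
    """Record numbers worth fetching for the NYC / 3-bedroom / price query.
    Each one still has to be read and re-checked, since the index entry may
    belong to a version of the record other than the current one."""
    for recno in walk_range(price_idx, 100000, 200000):
        if recno in city_idx.get('NYC', ()) and recno in bedrooms_idx.get(3, ()):
            yield recno
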
The harder case is this:
select first 200 skip 0 property_picture from properties
where city = 'NYC' and
bedrooms between 3 and 5 and
price between 100000 and 200000
order by price ascending
A very smart (probably dangerously smart) compiler/optimizer could
rewrite that as:
select first 200 skip 0 property_picture from properties
where city = 'NYC' and
bedrooms in (3, 4, 5) and
price between 100000 and 200000
order by price ascending
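
In the same sketch, that rewrite just turns the bedrooms test into a handful
of equality dips, assuming the optimizer can tell that the column is integral
and the range is small:

def bedrooms_match(bedrooms_idx, recno, low=3, high=5):
    """Probe bedrooms = 3, 4 and 5 by (value, record number) rather than
    scanning the bedrooms index for the whole range."""
    return any(recno in bedrooms_idx.get(value, ())
               for value in range(low, high + 1))
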
Real ranges in the secondary indexes are harder, but not as hard as
handling ranges in compound keys.
select first 200 skip 0 property_picture from properties
where city = 'NYC' and
square_area between 500 and 1000 and
price between 100000 and 200000
order by price ascending
In this case, it makes sense to walk the index on price from 100000
to 200000 and dip into the index on city using the key 'NYC' and the
record numbers that show up. That alone would avoid a lot of record
reads at the cost of keeping a couple of levels of the city index in
memory.
But what to do about the index on square_area?
One choice is to create the usual bitmap on the first reference and
check record numbers in it as the index walk on price returns them.
That has the current problem of creating a huge bitmap for a
non-selective index and keeping it around. Another, messier
alternative might be to dip into the index for each record number,
something like this:
Look for square_area = 500 and record number = 2376347. If that pair
doesn't exist, read across the leaf level of the index until square_area >
500, then go back to the top of the index and look for
square_area = <next value> and record number = 2376347. That sounds a lot
like the old index thrashing that caused garbage collection to be so
very slow.
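
For what it's worth, the two choices look like this in the toy model;
square_idx is the invented dict standing in for the square_area index, and
price_hits is whatever the price walk produced:

def filter_by_bitmap(square_idx, price_hits):
    """Choice 1: build the usual bitmap for 500 <= square_area <= 1000 once,
    then test each record number from the price walk against it."""
    bitmap = {recno
              for value, recnos in square_idx.items() if 500 <= value <= 1000
              for recno in recnos}                       # potentially huge
    return [recno for recno in price_hits if recno in bitmap]

def filter_by_dips(square_idx, price_hits):
    """Choice 2: for each record number, probe (value, recno) for every key
    value in the range -- the repeated leaf visits that start to look like
    the old garbage-collection thrashing."""
    values_in_range = [v for v in sorted(square_idx) if 500 <= v <= 1000]
    return [recno for recno in price_hits
            if any(recno in square_idx[v] for v in values_in_range)]
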
I think there's a real case to be made for dipping into non-selective
indexes when there's an equality match, and there may be a case for
using them for ranges.
Best regards,
Ann