Subject | Re: [firebird-support] High selectivity |
---|---|
Author | Helen Borrie |
Post date | 2003-09-26T12:32:36Z |
At 02:06 PM 26/09/2003 +0200, you wrote:
"highness" corresponds with "proximity to uniqueness". So the index with
the highest possible selectivity is the single-column unique one. The one
with the lowest possible selectivity is the two-phase switch.
(or most) rows in the table having one value - that causes selectivity to
be statistically low. I recall a poor fellow in this list who had an app
with an indexed column COUNTRY on a table with millions of rows. Now,
there are something like 300 + country codes worldwide; the problem was,
all of the records in this table had the same country code! He got a lot
happier when he removed the index on COUNTRY.
this index would take forever and a day shinning up and down two massive
chains of duplicates.
heLen
>Hi!No, you avoid indexes with LOW selectivity. I know it's confusing, but
>
>Everyone always says that one should avoid indices with high
>selectivity,
"highness" corresponds with "proximity to uniqueness". So the index with
the highest possible selectivity is the single-column unique one. The one
with the lowest possible selectivity is the two-phase switch.
>and that they can even make the query slower.:-)
>I understand that.
>But doesn't that depend on how evenly the values are distributed?No...well, you could have an index with moderate selectivity but have all
(or most) rows in the table having one value - that causes selectivity to
be statistically low. I recall a poor fellow in this list who had an app
with an indexed column COUNTRY on a table with millions of rows. Now,
there are something like 300 + country codes worldwide; the problem was,
all of the records in this table had the same country code! He got a lot
happier when he removed the index on COUNTRY.
>Suppose for instance that I have a STATE field on a table, that has twoNo way. Firebird maintains indexes as binary trees. An indexed search on
>possible values, 0 and 1.
>All new records start out in STATE 0, and after a while, they end up in
>STATE 1. Obviously, the number of records in STATE 0 will remain roughly
>constant, whereas the number of records in STATE 1 will be continually
>growing. If (as it is highly likely), I know in advance that I will
>almost never need to select all records in STATE 1, but very frequently
>all records in STATE 0, would it not be advantageous to define an index?
this index would take forever and a day shinning up and down two massive
chains of duplicates.
heLen