Subject | Re: [firebird-support] Selectivity Question |
---|---|
Author | Helen Borrie |
Post date | 2003-11-05T04:57:20Z |
At 10:22 PM 4/11/2003 -0500, you wrote:
2. Do a query to get the current selectivity for your indexes:
SELECT RDB$INDEX_NAME, RDB$STATISTICS
FROM RDB$INDICES
WHERE RDB$RELATION_NAME NOT STARTING WITH 'RDB$'
Here's what Ann says regarding this magic number in RDB$STATISTICS:
It is the selectivity. It's set when index fast-load runs (e.g. restore
from a backup, reactivate) and as a result of the
SET STATISTICS INDEX <name>;
command. The formula is simple.
-- If the index is empty, then the value is 0.
-- Otherwise, the value is
1 divided by (the number of index entries - the number of duplicates).
The smaller the selectivity value, the better the index.
I'd add, the larger the selectivity value, the LOWER the selectivity of the
index. A unique index has the lowest selectivity value possible and it has
the highest possible selectivity.
However, you should also run
gstat -index db_path_and_name -t table-name
to look at the figures on the index trees on each index. Look at the
figures for nodes (= number of entries in the index), total dup (the number
of nodes that have duplicates) and max dup (the length of the longest
duplicate chain). These will tell you useful stuff about the geometry of
the index. Long duplicate chains are a sign of an index you should drop.
h.
>Hello all,1. Run SET STATISTICS on the database containing real-life data.
>
>Are there any hard rules about selectivity for indexes and the numbers
>that I should look for? I in the process of writing an app that will
>house many record in a FB 1.5 DB and I want to make sure that I have not
>introduced indices that will adversely affect performance.
2. Do a query to get the current selectivity for your indexes:
SELECT RDB$INDEX_NAME, RDB$STATISTICS
FROM RDB$INDICES
WHERE RDB$RELATION_NAME NOT STARTING WITH 'RDB$'
Here's what Ann says regarding this magic number in RDB$STATISTICS:
It is the selectivity. It's set when index fast-load runs (e.g. restore
from a backup, reactivate) and as a result of the
SET STATISTICS INDEX <name>;
command. The formula is simple.
-- If the index is empty, then the value is 0.
-- Otherwise, the value is
1 divided by (the number of index entries - the number of duplicates).
The smaller the selectivity value, the better the index.
I'd add, the larger the selectivity value, the LOWER the selectivity of the
index. A unique index has the lowest selectivity value possible and it has
the highest possible selectivity.
However, you should also run
gstat -index db_path_and_name -t table-name
to look at the figures on the index trees on each index. Look at the
figures for nodes (= number of entries in the index), total dup (the number
of nodes that have duplicates) and max dup (the length of the longest
duplicate chain). These will tell you useful stuff about the geometry of
the index. Long duplicate chains are a sign of an index you should drop.
h.