firebird-support - Re: [firebird-support] Index Selectivity

Subject	Re: [firebird-support] Index Selectivity
Author	Arno Brinkman
Post date	2003-12-08T12:36:45Z

Hi,

> How do I know if the following indexes have too many duplicates? In other
> words, when is a chain of duplicates "too long", and when should I

consider

> dropping the index?

There isn't a hard correlation between nodes vs duplicates that i'm aware of
that decide to drop a index. This depends a litlle on what data is on that
index and when the index is used by your queries. Extreme: When you've a
index on a boolean with many FALSE values and only a few TRUE values. On
this index the duplicates would be a lot, but if you ONLY filter on TRUE
values then this index is helpfull, because it has to return just a few
records. But when you ask for FALSE values then almost the complete index
needs to be visit and finally almost all records are returned. The
index-search slows down the retrieval here.

> Index IDX_LOGSHIFTNUM (1)
> Depth: 3, leaf buckets: 1149, nodes: 179393
> Average data length: 0.00, total dup: 179090, max dup: 1400

179090 duplicates out of 179393 nodes is a lot i would drop this index
unless there a very specic reason for.

> Index IDX_LOGSTOCK (2)
> Depth: 3, leaf buckets: 1904, nodes: 179394
> Average data length: 0.00, total dup: 178417, max dup: 32753

178417 duplicates out of 179394 nodes is also a lot i would drop this index
too.

Same goes for :

> Index IDX_LOG_CLIENT (4)
> Depth: 3, leaf buckets: 1422, nodes: 179393
> Average data length: 0.00, total dup: 179345, max dup: 32675
> Index IDX_LOG_CLNT_SUB (6)
> Depth: 3, leaf buckets: 1416, nodes: 179393
> Average data length: 0.00, total dup: 179386, max dup: 32675
> Index IDX_OPT (3)
> Depth: 3, leaf buckets: 1938, nodes: 179394
> Average data length: 0.00, total dup: 179357, max dup: 32698

> Index IDX_UPDATELOG (9)
> Depth: 5, leaf buckets: 9734, nodes: 179394
> Average data length: 29.00, total dup: 4902, max dup: 2748

This index hasn't so much duplicates thus could be usefull.

> And what is meant by "fill distribution"?

Fill distribution is how much data is used on the page size.
For example with a page size from 4096 bytes and 80 pages with 50%
fill-distribution then there's 160KB wasted space. This is not necessary a
problem, because nodes are inserted in the middle of the page. When you
decactive/active a index then the index is rebuild and you would see a high
fill distribution.

Regards,
Arno Brinkman
ABVisie

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Firebird links :
http://www.firebirdsql.com
http://www.firebirdsql.info
http://www.fingerbird.de/
http://www.comunidade-firebird.org/

Nederlandse firebird nieuwsgroep :
news://80.126.130.81