Subject: Re: [firebird-support] Re: Index analysis - what does mean nodes, dup and max dup?
Author: Ann W. Harrison
Post date: 2008-11-22T20:26:33Z
Elton,
>>> Does that mean the first index is better than this second one?
>> No, indexes are good if they make queries faster, bad otherwise.
>
> So only by running a query can I say whether it is a bad or good index?

It's hard to come up with really good rules about indexes. The usual
example of a field that should not be indexed is sex - where there are
only two values in roughly equal distribution. On the other hand, if you
had a table of the inhabitants of a nunnery, an index on sex would be good
if you often wanted to find the males.
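To put rough numbers on that selectivity argument, here is a toy estimate (my own illustration, not Firebird's actual optimizer arithmetic): an equality lookup through an index returns roughly the table size times the fraction of rows holding that value.

```python
# Toy estimate of how many rows an equality lookup returns
# (illustrative only -- not Firebird's real cost model).

def expected_matches(total_rows, matching_fraction):
    """Rows holding one key value that covers matching_fraction of the table."""
    return int(total_rows * matching_fraction)

# sex with a roughly 50/50 split: the index still returns half the table,
# so a sequential scan is usually cheaper
print(expected_matches(1_000_000, 0.5))     # 500000

# a nunnery where males are one row in ten thousand: the index pinpoints them
print(expected_matches(1_000_000, 0.0001))  # 100
```

When a lookup returns half the table, walking the index just adds random page reads on top of the work a plain scan would do anyway.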
The worst part of bad indexes was their effect on garbage collection.
In older versions of Firebird, removing duplicate entries from an index
was very expensive because the duplicate chains were ordered so that
the newest records were at the front, but records are stored with the
new values last (normally). Removing the oldest duplicate (which is
likely to be the first one removed in a major clean-up) meant reading
the whole duplicate chain and releasing the last entry. Removing the
second oldest meant reading the chain again... and so on.
Newer versions order duplicates by record number which makes garbage
collection much faster.
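To make that cost difference concrete, here is a toy count of index entries visited when cleaning out a whole chain (my own simplification of the behavior described above, using the max dup value of 4158 that appears in the stats quoted in this thread):

```python
# Toy model of garbage-collecting every entry of one duplicate chain
# (a simplification of the behavior described above, not real engine code).

def visits_newest_first(chain_len):
    """Chain ordered newest-first: each removal walks to the chain's
    current last (oldest) entry, so the next removal walks it again."""
    return sum(remaining for remaining in range(chain_len, 0, -1))

def visits_by_record_number(chain_len):
    """Chain ordered by record number: each entry is located directly,
    so the work is roughly one visit per removal."""
    return chain_len

print(visits_newest_first(4158))      # 8646561 entries visited
print(visits_by_record_number(4158))  # 4158
```

That quadratic-versus-linear gap is why long duplicate chains hurt so badly on the older versions.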
> This is a more recent set of stats. Just by viewing it, can I say whether
> these are good or bad indexes?
> Index SAIDAPRODUTOS_IDX2 (2)
> Depth: 3, leaf buckets: 843, nodes: 698840
> Average data length: 0.01, total dup: 698609, max dup: 4158
> Index SAIDAPRODUTOS_IDX3 (3)
> Depth: 3, leaf buckets: 1349, nodes: 698840
> Average data length: 0.01, total dup: 697202, max dup: 99598

Well, with Firebird 1.5 or older, the second index could be a problem
for garbage collection. If you use those fields in WHERE or ON clauses,
the indexes probably do help.
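For what it's worth, the gstat lines above can be decoded like this (my reading of the fields; the helper function is my own, and it assumes distinct keys = nodes minus total dup):

```python
# Decode the gstat index stats quoted above (helper is my own shorthand;
# assumption: distinct key values = nodes - total dup).

def summarize(nodes, total_dup, max_dup):
    distinct = nodes - total_dup   # number of distinct key values
    avg_chain = nodes / distinct   # average duplicate-chain length
    return distinct, avg_chain, max_dup

print(summarize(698840, 698609, 4158))   # IDX2: 231 keys, ~3025 entries each
print(summarize(698840, 697202, 99598))  # IDX3: 1638 keys, ~427 on average
```

Although IDX3 has more distinct values overall, its longest chain (99598 entries) is far worse than IDX2's, which is why it is the one that could hurt garbage collection on the older versions.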
> Thank you for the help again. Sorry this topic is taking so long, but I
> want to understand these index things.

I've given you the theory of it, but there's nothing like benchmarking
your application with different indexes enabled to give you a reality
check. (What's the difference between theory and practice? In theory,
there is no difference.)
Good luck,
Ann