Subject | Re: Creating index |
---|---|
Author | Adam |
Post date | 2006-02-12T01:04:24Z |
> Is it a good choice to create an index on a field that only forGenerally no.
> checking null/not null, Yes/No, 'Y'/'N'?
>
There are exceptions if the field values are skewed. For example, if
99% of the values are 'Y', and only 1% are 'N', then the index would
be very useful to perform where field = 'N'.
When you have an index, Firebird will build a bitmap of the values
that meet the criteria by reading the index pages. When it knows the
page where the data of interest is located, it will then go to the
page and read it, then back to the index to get the next record and so
on.
Where no index is available, the records are read in storage order and
evaluated as to whether they meet the condition. There is a lot less
jumping around with this natural read, but it does mean that Firebird
needs to read every record to decide.
In the case where the index does not eliminate most of the records,
the overhead in checking the index, going to the data page, back to
the index back to the data page will be more expensive then where no
index is available. This is one of the problems I suppose with the
Firebird implementation of foreign key constraints. These constraints
will automatically create an index on the field in the relationship,
and these relationships often have poor selectivity.
Also keep in mind that you may want to index such a field in a
superkey. Combining these fields with other conditions may actually be
quite useful.
At the end of the day, no optimiser is perfect. There will always be
cases where what you know about the shape of the data is superior to
what the dbms can reasonably be expected to know. In these cases, you
can modify the expression so that the index can not be used.
For example, in one of our tables we have a foreign key to the site
table. For many customers, there are only a few sites, so this FK
index does more harm then good when used in a query.
If you wanted to get all the records for SiteID 1,
select *
from table
where SiteID = 1
The optimiser would use the FK index, but it may be cheaper to just
read table naturally. The following table will not use this index
because the expression SiteID+0 is not indexed (even though we know
this is identical).
select *
from table
where SiteID+0 = 1
In the second case, the query runs significantly faster. Now of course
if the data were shaped differently with a lot of possible Site
Records and not a huge number of records in this table, then the index
would be very useful.
Of course if you are only after a straight answer, my first two words
gives that. But your job as the database designer and as a developer
is to understand how Firebird will execute every statement you give
it. Understand the limitations on what it actually knows about data
shape and you can really make it fly. I have on many occasions reduced
the execution time of a query from minutes to subsecond.
But also understand that the optimiser will know stuff you don't.
Perhaps different customers have different shaped data, or the cache
already contains particular indices that can be used. This is why you
should not explicitly force a plan, but rather let the optimiser
choose its own execution path, and remove options to use an index you
know is not helping. An index added in the future that may be
extremely beneficial to the query. If you hard code a PLAN, then the
optimiser will not even bother analysing the possibility of using it,
whereas if you just use the +0 trick to disable the use of a specific
index then it may often be able to just run faster.
Adam