Subject Re: [firebird-support] Primary Keys / Indexes
Author Ann W. Harrison
volhoop wrote:
> My understanding of how Firebird uses Primary keys and indexes are below.
> If there is a Primary Key or Index available, the Optimizer will use it.
>
> I come from an AS400 environment. The programmer chooses the index
> (logical) to be used not the database. How does the optimizer choose,
> and what types of problems arise from that situation? Can I choose
> the index to be used rather than the optimizer?

One of the really good ideas behind relational databases - as opposed
to network or CODASYL databases that were the gold standard in the
early eighties - is the separation of content and access strategy,
which generalizes as data abstraction.

The code that chooses an access strategy is called an optimizer -
optimiser for the east side of the Atlantic. Optimizers are of
three sorts, rule based, cost based, and mixed. Generally, academic
optimizers are rule based, commercial optimizers are cost based and
all optimizers end up mixing the two techniques. The Firebird
optimizer is primarily cost based. What that means is that it
determines the tables to be accessed, the links between them (called
conjuncts), the access paths through those links, the number of
records in each table, and the number of records that are likely
to come through each conjunct.

For example, if you have a table of classes and a table of teachers
and the link between them is the teacher id which is present in the
class table and the primary key of the teacher table, you know that
for each class table, there will be exactly one record from the teacher
table through that conjunct.

What the optimizer does is pick the lowest cost path through the tables
in the query. It will run into problems if the count (aka cardinality)
or restrictiveness (selectivity) are wrong. The cardinality is seldom
wrong, because it is based on the physical size of the table compared
to the physical size of a record. Selectivity can go wrong if data
changes radically after an index is created. You can reset the
selectivity with SET SELECTIVITY.

Something that Firebird does that most databases can not is use
multiple indexes on a single table. For example, suppose in the
case above, you want all Spanish classes taught by tenured teachers.
The optimizer would notices that the fastest path is to find the
Spanish classes, then look up the teacher using the primary key
index. If the teachers table has an index on the "status" field
that includes whether the tenure status, Firebird can eliminate
non-tenured teachers without looking at the records.

You can also specify a PLAN and tell the optimizer to keep its
opinions to itself. The problems with plans are that

1) generally, the optimizer is pretty good at figuring out what
to do.

2) data distributions change, but plans don't

3) if the optimizer gets something wrong and you go to a plan, the
optimizer developers have missed an opportunity to improve their code.


Regards,


Ann