Subject Re: [firebird-support] inner join index optimization problem
Author Svein Erling Tysvaer
First, thanks for a nice and thorough problem description!

Then, I'll do some cutting and pasting of your original posting and put
some questions/explanations within:

> SELECT pvl.component_id, pvl.property_id, pv.property_value as
> property_value
> FROM property_value pv, property_value_list pvl
> WHERE pvl.system_id = 8413 AND pvl.test_script_id = 67919 AND
> pv.id=pvl.property_value_id
> ORDER BY pvl.component_id, pvl.property_id, pvl.property_value_id
>
> I would expect it to use the plan
> PLAN SORT (JOIN (PV INDEX(RDB$PRIMARY39), PVL INDEX (RDB$FOREIGN70,
> RDB$FOREIGN73, RDB$FOREIGN72)))

This plan is impossible to use (and encouraging Firebird to use such an
index by adding 'AND pv.id>0' or any other statement that matches all
rows, is a bad idea, unless it changes the order of the aliases in the
plan (which can be done in a better way, anyway), it will invariably
SLOW DOWN the query). PLANs should be read from left to right (and from
down and up). You've decided to start with PV. Then you have to ask
yourself the question. Which VALUE should I look up in this table? The
closest you can get to answer that, is pvl.property_value_id. But, pv is
your first alias of the PLAN. You don't yet know the value of
pvl.property_value_id! Hence, you cannot use the index and have to go
NATURAL.

> This uses: PLAN JOIN (PVLS PVL INDEX (RDB$FOREIGN73, RDB$FOREIGN72),
> PROPERTY_VALUE INDEX (RDB$PRIMARY39))

Yep, something similar to this is another option. pvl.SYSTEM_ID and
pvl.TEST_SCRIPT_ID are compared to fixed values, and hence you can use
RDB$FOREIGN73 and RDB$FOREIGN72. And now that you've started with pvl,
you know the value of pvl.property_value_id and can use RDB$PRIMARY39.

> I don't have much experience optimizing FB queries (yet) so if
> somebody can throw some light on this behavior, it will be greatly
> appreciated.

Well, you've already told us that the last PLAN is quicker than the
first one. The ideal optimizer would have figured this by himself, but
in your case, Firebird failed to do so (Arno might be able to tell why,
but in general I'm very happy with the optimizer and some queries may
require knowledge about the data that you can never expect the optimizer
to have).

I recommend you not to hardcode the plan you desire, but rather try to
negotiate with the optimizer through preventing its first choice. In
your case, you want the optimizer to start with the pvl alias rather
than the pv alias. Preventing it from using an index for the linking
between the tables on the pvl side ought to help the optimizer to make a
better decision. So try adding +0 or ||'' like this (I've also updated
from SQL-89 to SQL-92 - this helps the optimizer and some component sets):

SELECT pvl.component_id, pvl.property_id,
pv.property_value as property_value
FROM property_value pv
JOIN property_value_list pvl ON pv.id=pvl.property_value_id+0
WHERE pvl.system_id = 8413 AND pvl.test_script_id = 67919
ORDER BY pvl.component_id, pvl.property_id, pvl.property_value_id

This prevents the optimizer from using the RDB$FOREIGN70 index. It may
of course still choose

PLAN SORT (JOIN (PV NATURAL, PVL INDEX (RDB$FOREIGN73, RDB$FOREIGN72)))

but I'd be very surprised if it did (the optimizer is your friend, not
your enemy). Normally, hinting like done above gets the plan you desire,
or - occationally - another plan that you did not think of and that may
be better or worse than the one you wanted. In your case, I'm almost
certain it would choose

PLAN JOIN (PVL INDEX (RDB$FOREIGN73, RDB$FOREIGN72),
PV INDEX (RDB$PRIMARY39))

which I'd say is the ideal plan.

HTH,
Set

-so much fuzz for adding absolutely nothing...