Subject Re: [firebird-support] Query Optimization
Author Ann W. Harrison
marc_guillot wrote:
>
> ¿ Somebody knows why a very simple query like that doesn't optimizes
> with indexes ?
>
> select VTA_ID
> from VT_VENTAS
> inner join CL_CLIENTES on CLI_ID = VTA_CLI_ID
> inner join VT_GAFAS on GAF_VTA_ID = VTA_ID
> where VTA_FECHA is not null
>
> PLAN JOIN (CL_CLIENTES NATURAL,VT_VENTAS INDEX
> (VT_VENTAS_IDX1),VT_GAFAS INDEX (VT_GAFAS_IDX1))
>
> CLI_ID is a primary key of CL_CLIENTES.

That plan does use indexes on VT_VENTAS and VT_GAFAS. The expression
"is not null" probably includes too many records to be useful for an
indexed lookup. The order of joins depends on table cardinality and
index selectivity. You can improve the choice by issuing a "SET
STATISTICS" statement on your indexes from time to time. In this case,
the optimizer thinks that the minimal intermediate product is created by
reading the CL_CLIENTES table, using values from there to lookup
matching VT_VENTAS records and using the value from that record to look
up matching VT_GAFAS records. With inner joins, Firebird chooses the
join order it calculates to be best (cost based optimization),
regardless of the order you list the tables (syntax based optimization).
>
> If I changes the query to
>
> select VTA_ID
> from VT_VENTAS
> left outer join CL_CLIENTES on CLI_ID = VTA_CLI_ID
> inner join VT_GAFAS on GAF_VTA_ID = VTA_ID
> where VTA_FECHA is not null
>
> PLAN JOIN (JOIN (VT_VENTAS NATURAL,CL_CLIENTES INDEX
> (PK_CL_CLIENTES)),VT_GAFAS INDEX (VT_GAFAS_IDX1))
>
> Now CL_CLIENTES uses Index in primary key but VT_VENTAS not uses Index
> VT_VENTAS_IDX1 for field VTA_FECHA. (but the result set is not exactly
> what I'am looking for).
>

When you switch to outer joins, you constrain the choice of join order.
You're saying that you want all VT_VENTAS records whether or not they
match records in CL_CLIENTES, so the query can't do nested loops
starting with CL_CLIENTES. So it starts with a loop through the
VT_VENTAS records - again, the condition in the where clauses is not
restrictive enough to be the source of an indexed lookup - then looks up
CL_CLIENTES by primary key and VT_GAFAS using the same index as previously.

Without knowing the cardinality and selectivity of your data, I can't
say for sure that the optimizer is making a mistake, but both those
solutions look OK to me.

Regards,


Ann