Subject Re: [firebird-support] Why is Firebird doing a natural scan?
Author Alexandre Benson Smith
Carsten Schäfer wrote:
> I have simple case where firebird is doing a natural scan when it imo
> should use an index.
> (Firebird 2.1.2 RC2 Superserver on Windows Vista 64bit)
>
> Query is:
> SELECT apos.f_nr FROM t_lieferscheinposition pos join t_apos apos on
> pos.f_id_apos = apos.id_apos where pos.f_id_lieferschein = 100;
> Plan
> PLAN JOIN (APOS NATURAL, POS INDEX (FK_LP_APOS, FK_LP_LI))
> Adapted Plan
> PLAN JOIN (APOS NATURAL, POS INDEX (FK_LP_APOS, FK_LP_LI))
>
> DDL is (simplified):
> CREATE TABLE T_LIEFERSCHEINPOSITION (
> ID_LIEFERSCHEINPOSITION INTEGER NOT NULL,
> F_ID_LIEFERSCHEIN INTEGER NOT NULL,
> F_ID_APOS INTEGER NOT NULL
> );
>
> ALTER TABLE T_LIEFERSCHEINPOSITION ADD CONSTRAINT
> PK_LIEFERSCHEINPOSITION PRIMARY KEY (ID_LIEFERSCHEINPOSITION);
>
> ALTER TABLE T_LIEFERSCHEINPOSITION ADD CONSTRAINT FK_LP_APOS FOREIGN KEY
> (F_ID_APOS) REFERENCES T_APOS (ID_APOS) ON DELETE CASCADE ON UPDATE NO
> ACTION;
> ALTER TABLE T_LIEFERSCHEINPOSITION ADD CONSTRAINT FK_LP_LI FOREIGN KEY
> (F_ID_LIEFERSCHEIN) REFERENCES T_LIEFERSCHEIN (ID_LIEFERSCHEIN) ON
> DELETE CASCADE ON UPDATE NO ACTION;
>
> CREATE TABLE T_APOS (
> ID_APOS INTEGER NOT NULL,
> F_NR VARCHAR(25)
> );
>
> ALTER TABLE T_APOS ADD UNIQUE (F_NR);
>
> ALTER TABLE T_APOS ADD PRIMARY KEY (ID_APOS);
>
>
> When i change the query to use a left join no natural scan is used:
> SELECT apos.f_nr FROM t_lieferscheinposition pos left join t_apos apos
> on pos.f_id_apos = apos.id_apos where pos.f_id_lieferschein = 100;
> Plan
> PLAN JOIN (POS INDEX (FK_LP_LI), APOS INDEX (RDB$PRIMARY36))
> Adapted Plan
> PLAN JOIN (POS INDEX (FK_LP_LI), APOS INDEX (INTEG_534))
>
> Can someone please explain me this behaviour?
> Why does the left join make the difference?
> For every value in t_lieferscheinposition there must be a value in
> t_apos because of the foreign key.
> So imo the result when using an inner join should always be the same as
> when i use a left join.
>
> best regards,
> Carsten
>

An outer join limits the choice of stream order to be used, when you put
a left join the first table *must* be POS, so the optimizer starts by
the POS table and finds the index usefull, so uses it, and then join to
APOS table using the PK index.

The first query without the outer join lets the optimizer use 2 distinct
orders to retrieve data (POS or APOS first), for some reason the
optimizer thinks that would be cheaper to use the APOS as the first
table and then use the FK index to join to the POS table.

The reasons could be some of those:
1.) Index statistics out of date, so the optimizer thinks that the
FK_LP_LI is not a good index to filter out the records and the after
appling the filter the amount of records would be greater than whole
APOS table
2.) APOS table has very few records.

The first thing I would try is to update the index selectivity using SET
STATISTCS.

Another approach would be rewrite your query as

SELECT apos.f_nr FROM t_lieferscheinposition pos join t_apos apos on
pos.f_id_apos = apos.id_apos+0 where pos.f_id_lieferschein = 100;



see you !

--
Alexandre Benson Smith
Development
THOR Software e Comercial Ltda
Santo Andre - Sao Paulo - Brazil
www.thorsoftware.com.br