Subject Re: Help on query performing natural scan
Author Alexander V.Nevsky
--- In firebird-support@yahoogroups.com, Bob Murdoch <ram5@e...>
wrote:
> I have a query that involves two tables that form a master-detail
> relationship, using FB1.5. When just those two tables are involved,
I get
> indexed reads on both tables with the proper indices being used.
However,
> when I add a third table to query, the foreign key index of the
detail
> table no longer gets used in the query - a natural scan is
performed. As
> well, the correct index is no longer used in the master table
either.
>
> Here's the first query that used the correct indices:
>
> select
> m.id, m.sale_id, m.sku, m.line_nbr, sum(m.qty) as qty
> from
> salescheck sc
> join sale_mdse m on (m.sale_id = sc.id)
> where
> (sc.del_date = '1/24/2004') and
> (sc.region = 10)
> group by
> m.id, m.sale_id, m.sku, m.line_nbr
>
> PLAN SORT (JOIN (SC INDEX (SALESCHECK_DEL_DATE_IDX),M INDEX
(RDB$FOREIGN174)))

Bob, here optimizer decided it is good to filter SC by sc.del_date
using index and than join result with M using it's foreign key to SC.

> In this second query, I need to pull some information from a third
table:
>
> select
> m.id, m.sale_id, m.sku, m.line_nbr, sum(m.qty) as qty
> from
> salescheck sc
> join sale_mdse m on (m.sale_id = sc.id)
> join sku sk on (sk.sku_nbr = m.sku) and (sk.account_id =
sc.account_id)
> where
> (sc.del_date = '1/24/2004') and
> (sc.region = 10)
> group by
> m.id, m.sale_id, m.sku, m.line_nbr
>
> PLAN SORT (JOIN (M NATURAL,SC INDEX (RDB$PRIMARY105),SK INDEX
> (RDB$PRIMARY110)))

Beacuse of unknown for me reason (wise persons says - pessimizer
estimates amount of records in the tables and indices selectivity ;) )
this time it decided it will be better to scan M naturally and search
both other tables for records corresponding to records in M using join
conditions together with their primary keys. BTW, sometimes it is
right and it is really most fast way to get results. Sometimes ;)
Easiest way to quickly check it - repeat query forcing plan which you
think is better:

select
m.id, m.sale_id, m.sku, m.line_nbr, sum(m.qty) as qty
from
salescheck sc
join sale_mdse m on (m.sale_id = sc.id)
join sku sk on (sk.sku_nbr = m.sku) and (sk.account_id =
sc.account_id)
where
(sc.del_date = '1/24/2004') and
(sc.region = 10)
group by
m.id, m.sale_id, m.sku, m.line_nbr
PLAN JOIN (SC INDEX (SALESCHECK_DEL_DATE_IDX),
M INDEX (RDB$FOREIGN174),
SK INDEX (RDB$PRIMARY110))

and compare time. If you are right it is better to rewrite query to
avoid explicit planning (if for some reason you in the future will be
forced to drop and recreate constrains or re-build database from
script, numbers of system indices will/can be changed). To reach
wanted result particularly in this query, try this:

select
m.id, m.sale_id, m.sku, m.line_nbr, sum(m.qty) as qty
from
salescheck sc
join sale_mdse m on (m.sale_id = sc.id+0) /*!!!*/
join sku sk on (sk.sku_nbr = m.sku) and (sk.account_id =
sc.account_id)
where
(sc.del_date = '1/24/2004') and
(sc.region = 10)
group by
m.id, m.sale_id, m.sku, m.line_nbr

introduced expression leads to impossibility to use SC INDEX
(RDB$PRIMARY105) and optimizer can go back to use SC INDEX
(SALESCHECK_DEL_DATE_IDX) and wanted sequence of joining tables.
Perhaps ;)
One more suggestion - try to refresh statistics of the indices and
repeat initial query on 3 tables, it can help optimizer if available
for him statistic is outdated (it is'nt automatically recalculated
when data is modified).

Best regards,
Alexander.