Subject Re: is there a way around the IN clause?
Author Svein Erling Tysvær
--- In firebird-support@yahoogroups.com, "Gary Benade" wrote:
> if I change this
>
> select c.name, c.surname, c.telephone, c.lastorderdate
> from customers c
> where not exists (select 1 from orders
> where customerlink = c.link
> and recdate > '01-jan-2004'
> and recdate < '01-mar-2004'
> and branchcode = 'SAN')
> and c.branchcode = 'SAN'
> order by lastorderdate desc,1,2
>
> PLAN SORT ((C INDEX (IDX_CUSTOMERS_BRANCHCODE)))
> PLAN (ORDERS INDEX (ORDERS_CUSTOMERLINK,ORDERS_RECDATE))
>
> to this
>
> select c.name, c.surname, c.telephone, c.lastorderdate
> from customers c
> where not exists (select 1 from orders
> where customerlink = c.link
> and recdate > '01-jan-2004'
> and recdate < '01-mar-2004')
> and c.branchcode = 'SAN'
> order by lastorderdate desc,1,2
>
> PLAN SORT ((C INDEX (IDX_CUSTOMERS_BRANCHCODE)))
> PLAN (ORDERS INDEX (ORDERS_CUSTOMERLINK))
>
> the query runs in, wait for it, 3 seconds initially and 0,2 seconds
> for subsequent queries :) - much more like the firebird I know.
> I just dont understand why the index on recdate is not being used?

Hi Gary!

Maybe because the branchcode is part of the recdate index? Or maybe
the optimizer is confused because you provide an ambiguous query? Try

select c.name, c.surname, c.telephone, c.lastorderdate
from customers c
where not exists (select 1 from orders o
where o.customerlink = c.link
and o.recdate > '01-jan-2004'
and o.recdate < '01-mar-2004'
and o.branchcode = 'SAN')
and c.branchcode = 'SAN'
order by lastorderdate desc,1,2

and see if that gives the same slow result (I think it does).

Anyway, you do not want to use any index on recdate in this case -
well, unless this branch has so few customers that the the number of
orders in total for a two month period is not much greater than the
total number of orders for one customer (i.e. it could be useful if
each customer had 20 orders in total and only 15 customers made one
order within the nearly two month time period of your query). The
reason? Well, it already has a selective index on customer to use for
the orders table. I think of it as "Why bother to look at your entire
orders list for that time period when you already have a list with
only the orders of the customer in question?"

The query I would recommend you if the suggestion above doesn't work,
is this:

select c.name, c.surname, c.telephone, c.lastorderdate
from customers c
where not exists (select 1 from orders o
where o.customerlink = c.link
and o.recdate + 0 between '02-jan-2004' and '29-feb-2004'
and o.branchcode || '' = 'SAN')
and c.branchcode = 'SAN'
order by c.lastorderdate desc,1,2

This way the optimizer cannot use any indexes for o.recdate or o.
branchcode that destroys your performance.

HTH,
Set