Subject Re: [firebird-support] IN, sub-select and select distinct
Author Jerome Bouvattier
Hello Arno,

Thanks for your time.

>
> The view was to remove the IN predicate what should speed up this query.

Yes, I think it's close to my Query #3' :

select distinct pres.productid
from sys_idlists L
inner join presence pres on(L.id = pres.outletid)
where
((l.groupid = 1)
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))

And it's exactly as fast.

> I hadn't understood that there were also duplicates for productid and that
you
> also want a distinct in de select.

In fact, there are duplicates *only* for productid. ;-)

> Then just try these 2 options :
>
> 1)
> SELECT
> DISTINCT
> pres.productid
> FROM
> presence pres
> JOIN product prod ON (pres.productid = prod.id)
> JOIN V_SYS_IDLISTS v ON (v.ID = pres.outletid)
> WHERE
> (pres.valuedate >= '2003-09-01') and (pres.valuedate < '2003-11-01')

See above.

> 2)
> SELECT
> pres.productid
> FROM
> presence pres
> JOIN product prod ON (pres.productid = prod.id)
> JOIN V_SYS_IDLISTS v ON (v.ID = pres.outletid)
> WHERE
> (pres.valuedate >= '2003-09-01') and (pres.valuedate < '2003-11-01')
> GROUP BY
> pres.productid

Interesting, but it's a little bit slower than the two previous options.
(480ms vs 430ms)

What surprises me, is that I cannot seem to find something at least as fast
as the "explicit IN" version. And I just noticed that if you have two IN
predicates then the "explicit IN" is always faster, w/o distinct.

SELECT PRES.PRODUCTID
FROM PRESENCE PRES
WHERE
PRES.OUTLETID IN (112191, 112193, 112194, 112200, 112201, 112205)
and PRES.VALUEDATE >= '2003-09-01'
and PRES.VALUEDATE < '2003-11-01'
and PRES.STATUS IN (1,2,3)

--> 50ms
--> 330ms with distinct
--> 80ms with Group by !!!

select pres.productid
from sys_idlists L1
inner join presence pres on L1.id = pres.outletid
inner join sys_idlists L2 on pres.status = L2.id
where
L1.groupid = 1 and L2.groupid = 2
and pres.valuedate >= '2003-09-01'
and pres.valuedate < '2003-11-01'

--> 481ms
--> 590ms with distinct
--> 590ms with group by

Again, sys_idlists contains only the 9 required records.

GROUPID ID
1 112191
1 112193
1 112194
1 112200
1 112201
1 112205
2 1
2 2
2 3

I can't believe there is no way to achieve the "explicit IN" performance
with a JOIN, a sub-select or anything else.

Any other clue ?

Best regards.

--
Jerome