Subject | IN, sub-select and select distinct |
---|---|
Author | Jerome Bouvattier |
Post date | 2004-03-11T17:18:17Z |
Hello list,
I'm trying, without success, to find how to replace an explicit IN clause by
a sub-select (or a join) in a "select distinct..." query without losing
performance. I'd really appreciate any help.
Questions at the bottom.
Given, the following query :
Query #1 (explicit IN):
==========
select pres.productid
from presence pres
where
((pres.outletid in (112191, 112193, 112194, 112200, 112201, 112205))
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
I want to get the object of the IN clause from a sub-query.
Which is
"select id from sys_idlists where groupid = 1)"
and returns exactly the same integers list as the initial IN clause.
The table sys_idlists *only* contains the 6 records.
this gives :
Query #2 (sub-select):
======================
select pres.productid
from presence pres inner join product prod on (pres.productid = prod.id)
where
((pres.outletid in (select id from sys_idlists where groupid = 1))
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
another variation would be :
Query #3 (join):
================
select pres.productid
from sys_idlists l
inner join presence pres on(l.id = pres.outletid)
where
((l.groupid = 1)
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
Results are :
Q1 : 50 ms
Q2 : 440 ms
Q3 : 10 ms
Plans are :
Q1 : PLAN (PRES INDEX
(PRESENCE_IDX1,RELATION_262,RELATION_262,RELATION_262,RELATION_262,RELATION_
262,RELATION_262))
Q2 : PLAN JOIN (L NATURAL,PRES INDEX (RELATION_262))
Q3 : PLAN (SYS_IDLISTS NATURAL) PLAN JOIN (PRES INDEX (PRESENCE_IDX1),PROD
INDEX (PK_PRODUCT))
All 3 queries return the same set.
Question 1 :
===========
Can I customize Q2 so that the sub-select is evaluated only once in order to
perform as fast as Q1 ? Or Q3 is the only way to go ?
Question 2 :
===========
I would be glad with Q3, but I need to apply a "Distinct"
(select distinct pres.productid from..). Then perf become :
Q1' : 301 ms
Q2' : 10 s 4 ms
Q3' : 431 ms
Q1' : PLAN SORT ((PRES INDEX
(PRESENCE_IDX1,RELATION_262,RELATION_262,RELATION_262,RELATION_262,RELATION_
262,RELATION_262)))
Q2' : PLAN (SYS_IDLISTS NATURAL) PLAN SORT (JOIN (PRES INDEX
(PRESENCE_IDX1),PROD INDEX (PK_PRODUCT)))
Q3' : PLAN SORT (JOIN (L NATURAL,PRES INDEX (RELATION_262)))
How should my SQL look like to perform as fast as Q1' but without an
explicit IN clause ?
Thanks a lot for your help.
--
Jerome
I'm trying, without success, to find how to replace an explicit IN clause by
a sub-select (or a join) in a "select distinct..." query without losing
performance. I'd really appreciate any help.
Questions at the bottom.
Given, the following query :
Query #1 (explicit IN):
==========
select pres.productid
from presence pres
where
((pres.outletid in (112191, 112193, 112194, 112200, 112201, 112205))
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
I want to get the object of the IN clause from a sub-query.
Which is
"select id from sys_idlists where groupid = 1)"
and returns exactly the same integers list as the initial IN clause.
The table sys_idlists *only* contains the 6 records.
this gives :
Query #2 (sub-select):
======================
select pres.productid
from presence pres inner join product prod on (pres.productid = prod.id)
where
((pres.outletid in (select id from sys_idlists where groupid = 1))
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
another variation would be :
Query #3 (join):
================
select pres.productid
from sys_idlists l
inner join presence pres on(l.id = pres.outletid)
where
((l.groupid = 1)
and (pres.valuedate >= '2003-09-01')
and (pres.valuedate < '2003-11-01'))
Results are :
Q1 : 50 ms
Q2 : 440 ms
Q3 : 10 ms
Plans are :
Q1 : PLAN (PRES INDEX
(PRESENCE_IDX1,RELATION_262,RELATION_262,RELATION_262,RELATION_262,RELATION_
262,RELATION_262))
Q2 : PLAN JOIN (L NATURAL,PRES INDEX (RELATION_262))
Q3 : PLAN (SYS_IDLISTS NATURAL) PLAN JOIN (PRES INDEX (PRESENCE_IDX1),PROD
INDEX (PK_PRODUCT))
All 3 queries return the same set.
Question 1 :
===========
Can I customize Q2 so that the sub-select is evaluated only once in order to
perform as fast as Q1 ? Or Q3 is the only way to go ?
Question 2 :
===========
I would be glad with Q3, but I need to apply a "Distinct"
(select distinct pres.productid from..). Then perf become :
Q1' : 301 ms
Q2' : 10 s 4 ms
Q3' : 431 ms
Q1' : PLAN SORT ((PRES INDEX
(PRESENCE_IDX1,RELATION_262,RELATION_262,RELATION_262,RELATION_262,RELATION_
262,RELATION_262)))
Q2' : PLAN (SYS_IDLISTS NATURAL) PLAN SORT (JOIN (PRES INDEX
(PRESENCE_IDX1),PROD INDEX (PK_PRODUCT)))
Q3' : PLAN SORT (JOIN (L NATURAL,PRES INDEX (RELATION_262)))
How should my SQL look like to perform as fast as Q1' but without an
explicit IN clause ?
Thanks a lot for your help.
--
Jerome