Subject | Optimizer not very good at IN/EXISTS |
---|---|
Author | Mathias Dellaert |
Post date | 2006-08-08T11:20Z |
Greetings,
I've noticed that the optimizer seems to be very inefficient when
working with IN/EXISTS clauses. Here's an example
SELECT * FROM SALES S
WHERE SID IN (SELECT MIN(SID)
FROM SALES S2
GROUP BY Product)
Uses plan:
PLAN (S2 ORDER IDX_SALES_BASE)
PLAN (S NATURAL)
(and then pretty much freezes up on execute)
(IDX_SALES_BASE is an index over the product field, SID is the primary
key)
Similarly, using an EXISTS:
select * from SALES S
WHERE EXISTS (SELECT Product
FROM SALES S2
GROUP BY Product
HAVING MIN(S2.SID) = S.SID )
Uses the same plan.
Where-as the "equivalent" query (in Firebird 2.0 RC2)
select Sales.* from
SALES,
(SELECT MIN(SID) as SID
FROM SALES
GROUP BY product) as Blah
WHERE Sales.SID = Blah.SID
Uses plan:
PLAN JOIN (BLAH SALES ORDER IDX_SALES_BASE, SALES INDEX (PK_SALES))
Note how this one actually uses the primary key.
Another note: simpler queries generate similar plans, like (and yes I
realise this is a silly query)
SELECT * FROM SALES S
WHERE SID IN (SELECT SID
FROM SALES S2
WHERE product='PC')
Which uses
PLAN (S2 INDEX (PK_SALES))
PLAN (S NATURAL)
Which seems even less efficient to me (since it doesn't even use the
product index).
For comparison:
SELECT SID FROM SALES S2 WHERE product = 'PC'
Uses
PLAN (S2 INDEX (IDX_SALES_BASE))
Now my question: are there any plans to improve the behaviour of the
IN/EXISTS clause? Is this a bug? It seems to me that now Firebird 2 has
greater subquery support, there's plenty of room for improvement here.
Regards,
MD
[Non-text portions of this message have been removed]
I've noticed that the optimizer seems to be very inefficient when
working with IN/EXISTS clauses. Here's an example
SELECT * FROM SALES S
WHERE SID IN (SELECT MIN(SID)
FROM SALES S2
GROUP BY Product)
Uses plan:
PLAN (S2 ORDER IDX_SALES_BASE)
PLAN (S NATURAL)
(and then pretty much freezes up on execute)
(IDX_SALES_BASE is an index over the product field, SID is the primary
key)
Similarly, using an EXISTS:
select * from SALES S
WHERE EXISTS (SELECT Product
FROM SALES S2
GROUP BY Product
HAVING MIN(S2.SID) = S.SID )
Uses the same plan.
Where-as the "equivalent" query (in Firebird 2.0 RC2)
select Sales.* from
SALES,
(SELECT MIN(SID) as SID
FROM SALES
GROUP BY product) as Blah
WHERE Sales.SID = Blah.SID
Uses plan:
PLAN JOIN (BLAH SALES ORDER IDX_SALES_BASE, SALES INDEX (PK_SALES))
Note how this one actually uses the primary key.
Another note: simpler queries generate similar plans, like (and yes I
realise this is a silly query)
SELECT * FROM SALES S
WHERE SID IN (SELECT SID
FROM SALES S2
WHERE product='PC')
Which uses
PLAN (S2 INDEX (PK_SALES))
PLAN (S NATURAL)
Which seems even less efficient to me (since it doesn't even use the
product index).
For comparison:
SELECT SID FROM SALES S2 WHERE product = 'PC'
Uses
PLAN (S2 INDEX (IDX_SALES_BASE))
Now my question: are there any plans to improve the behaviour of the
IN/EXISTS clause? Is this a bug? It seems to me that now Firebird 2 has
greater subquery support, there's plenty of room for improvement here.
Regards,
MD
[Non-text portions of this message have been removed]