Subject | Re: [firebird-support] Hints? |
---|---|
Author | Helen Borrie |
Post date | 2005-04-16T00:49:19Z |
At 03:38 PM 15/04/2005 -0400, you wrote:
It's just not an optimal query when the left-hand table is large, since
each hit on A will require a unique set to be formed from the subquery.
Since only one value of ID2 in A is being considered for each occasion that
the subquery runs, there is no use for the index that you view as
"appropriate". The situation where it is worthwhile to consider the
correlated subquery is when the left-hand set is small. Under those
conditions, the IN() syntax can be speedier than joining across all of the
tables.
The engine *should* resolve this query to the following (table identifier
corrections added here). The EXISTS() predicate will return as soon as it
finds the first eligible row in the subquery set, i.e. the entire set is
not traversed if there is a match:
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from B b1
join C c1
on b1.ID1 = c1.ID1
and c1.name = 'FOO'
WHERE b1.ID2 = a1.ID2)
or it might resolve it to this:
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from B b1
join C c1
on b1.ID1 = c1.ID1
WHERE b1.ID2 = a1.ID2
and c1.name = 'FOO' )
I don't know which one it actually elects for this query (Arno will know)
but you could experiment with both of them and compare their
performance. If one proves faster than the other and it is also faster
than your query, then you have beaten the optimizer.
Another thing to try is inverting tables B and C in the optimizer to help
the subquery to return faster. Since the subquery is searching a column in
C, I think the search on C.name should be faster that way, i.e.
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from C c1
join B b1
on b1.ID1 = c1.ID1
and c1.name = 'FOO'
WHERE b1.ID2 = a1.ID2)
Check the plans generated by the optimizer in all cases. You should get
two plans for each of these queries - one for the main query and one for
the subquery. If you are using a tool that only returns the outer plan,
test them in isql with SET PLAN ON.
However, with "tens of thousands of rows" in A, no correlated subquery is
going to beat a join.
./hb
>Does Firebird support table or join hints? I can't get Firebird to use theWithout rewriting the query, no.
>appropriate index when returning results from a subquery even when there
>exists an index on the joined column. My query is fairly complex, but even
>when I break it down to the following, it won't use the index on ID2 of
>table A. Table A contains tens of thousands of rows.
>
>
>SELECT A.NAME
>FROM A
>WHERE A.ID2 IN
> (SELECT B.ID2
> FROM B, C
> WHERE B.ID1 = C.ID1
> AND C.NAME = 'FOO')
>
>Any suggestions other than rewriting the query?
It's just not an optimal query when the left-hand table is large, since
each hit on A will require a unique set to be formed from the subquery.
Since only one value of ID2 in A is being considered for each occasion that
the subquery runs, there is no use for the index that you view as
"appropriate". The situation where it is worthwhile to consider the
correlated subquery is when the left-hand set is small. Under those
conditions, the IN() syntax can be speedier than joining across all of the
tables.
The engine *should* resolve this query to the following (table identifier
corrections added here). The EXISTS() predicate will return as soon as it
finds the first eligible row in the subquery set, i.e. the entire set is
not traversed if there is a match:
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from B b1
join C c1
on b1.ID1 = c1.ID1
and c1.name = 'FOO'
WHERE b1.ID2 = a1.ID2)
or it might resolve it to this:
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from B b1
join C c1
on b1.ID1 = c1.ID1
WHERE b1.ID2 = a1.ID2
and c1.name = 'FOO' )
I don't know which one it actually elects for this query (Arno will know)
but you could experiment with both of them and compare their
performance. If one proves faster than the other and it is also faster
than your query, then you have beaten the optimizer.
Another thing to try is inverting tables B and C in the optimizer to help
the subquery to return faster. Since the subquery is searching a column in
C, I think the search on C.name should be faster that way, i.e.
SELECT a1.NAME
FROM A a1
WHERE exists
(SELECT 1 from C c1
join B b1
on b1.ID1 = c1.ID1
and c1.name = 'FOO'
WHERE b1.ID2 = a1.ID2)
Check the plans generated by the optimizer in all cases. You should get
two plans for each of these queries - one for the main query and one for
the subquery. If you are using a tool that only returns the outer plan,
test them in isql with SET PLAN ON.
However, with "tens of thousands of rows" in A, no correlated subquery is
going to beat a join.
./hb