firebird-support - Re: [firebird-support] perf WHERE IN (SUBQUERY)

Subject	Re: [firebird-support] perf WHERE IN (SUBQUERY)
Author	Svein Erling Tysvaer
Post date	2006-09-09T21:39:28Z

legrand_legrand_63 wrote:

> Here is a simple test case:
>
> select * from a
> where a.id in (select b.id from b where l ='a') ;
>
> gives:
> PLAN (B INDEX (B_ID)) PLAN (A NATURAL)
>
> Would it be possible to access A with index A_ID ?
> like in:
>
> select * from a
> where a.id in (1,2)
>
> PLAN (A INDEX (A_ID, A_ID))
>
> Regards
> PAscal

Nope, Pascal, that's impossible. What you're asking Firebird to do, is
to for each record in A, to select all records in B with a matching L
and then compare. That means you're asking Firebird to perform the query
on B as many times as there are records in A (a very bad idea on largish
tables). Now, from Firebird 1.5 onwards (I think), Firebird isn't going
to honour your request, rather, Arno decided to change such a request
into the equivalent (and potentially much quicker)

select * from a
where exists(select * from b where b.id = a.id and b.l = 'a')

Still, the query has to check every if there is a matching B for every
record in A, and if it has to go through the entire table, there is no
quicker way than to go NATURAL.

A way to improve your query, could be

select a.* from a
join b on a.id = b.id
where b.l = 'a'

which could give you the following plan (I'm a bit uncertain about the
syntax):

PLAN JOIN (B INDEX (B_L), A INDEX(A_ID))

For this to return the same set as your original query, B.ID must be
unique. If not, you have to add DISTINCT (which in itself may be
incorrect if there are supposed to be duplicate rows in A) or a further
NOT EXISTS clause (this may be tailored to return the exact same result
regardless of duplicates).

I don't know whether it would be possible to improve Firebirds ways of
handling such requests. Obviously, the example above is very simple and
it should be possible to make the optimizer perform better (see how I
think it theoretically could be solved below), but it doesn't take much
additional complication to at least confuse my brain. And I'm not able
to imagine neither the potential problems or benefits in making Firebird
do similar conversions.

HTH,
Set

(here's below)

select a.* from a
join b on a.id = b.id
where b.l = 'a'
and not exists(select * from b2
where b.id = b2.id
and b2.l = 'a'
and b2.rdb$db_key < b.rdb$db_key)

(I'm uncertain about how to use rdb$db_key, replace with primary key if
available. Also, I think such a request might actually turn out slower
than the original select in some circumstances)