Subject Re: Optimisation Question
Author Svein Erling Tysvær
Hi Giulio!

Using NOT EXISTS (SELECT * FROM is generally a good option, although I
think Firebird has become better than InterBase was with NOT IN
(SELECT <field> FROM trying to translate most of them into a NOT
EXISTS. Maybe it was the DISTINCT that made things slow for you? In
the very near future I plan to run a program that will run a joined
query with several subselects (including a ... WHERE <field> = (SELECT
COUNT(DISTINCT <field>)...) more than 2000 times on tables with a few
hundred thousand records and I expect it to occupy that computer for
one or more days. Hence, 'dog slow' on tables your size when you
simply use a NOT EXISTS (or NOT IN without distinct) sounds strange
(mind you, I've never seen an animal run faster than the dog we used
to have), and I would guess that some indexes may be used
inappropriately. Post the query with the plan here, and we will see if
we can help.

Set

--- In firebird-support@yahoogroups.com, "donjules2k" wrote:
> Thank you very much for your input on this.
>
> Set, you now bring me to my next question. I have another query
> which does a NOT IN using a subquery. The outer table has about 2000
> rows and the inner table has about 32000.
>
> As you can imagine my query runs dog slow and I was going to ask
> what was the best way to optimise it. I tried using an inner join
> with != and much as I though, that was a crazy idea and would lead
> to the same if not longer execution time. I see you run a query
> like:
>
> select TABLE_A.field1
> from TABLE_A
> where not exists (SELECT * FROM TABLE_B WHERE TABLE_A.joinfield =
> TABLE_B.joinfield)
>
> Is this the best way to run such a query in firebird?
>
> Also, what are the plans for the next FB release and how high a
> priority is plan optimisation?
>
> Thanks again :-)
>
> Giulio
>
>
>
> --- In firebird-support@yahoogroups.com, Svein Erling Tysvær
> <svein.erling.tysvaer@k...> wrote:
> > --- In firebird-support@yahoogroups.com, "Arno Brinkman" wrote:
> > > Hi,
> > >
> > > > > With firebird, why is that this:
> > > > >
> > > > > select TABLE_A.field1
> > > > > from TABLE_A
> > > > > where TABLE_A.joinfield not in
> > > > > (select distinct joinfield from TABLE_B)
> > > > >
> > > > > runs much much much slower than
> > > > >
> > > > > SELECT TABLE_A.field1
> > > > > from TABLE_A
> > > > > inner JOIN TABLE_A on TABLE_A.joinfield = TABLE_B.joinfield
> > > > >
> > > > > which is lightning quick.
> > > >
> > > > The optimizer could rewrite the first query into the second,
> but
> > > > doesn't. Arno might explain why ;-)
> > >
> > > Well, certainly in this case they are complete different
queries?
> > > Would not be good if engine would rewrite query1 into query2 ;-)
> >
> > Maybe we could introduce a new SQL extention:
> >
> > select TABLE_A.field1
> > from TABLE_A
> > where TABLE_A.joinfield not in
> > (select distinct joinfield from TABLE_B)
> > USING UPSCENE OPTIMIZATION [WITHOUT BAMSEMUMS]
> >
> > which created such funny results?
> >
> > On a more serious note, why does the optimizer care about DISTINCT
> > when using subselects with [NOT] IN at all? Anyway, it is strange
> SQL
> > and I think I will continue using
> >
> > select TABLE_A.field1
> > from TABLE_A
> > where not exists (SELECT * FROM TABLE_B WHERE TABLE_A.joinfield =
> > TABLE_B.joinfield)
> >
> > for such queries.
> >
> > Set