Subject | Re: Query plan not using index |
---|---|
Author | Svein Erling Tysvær |
Post date | 2005-02-12T23:53:12Z |
Hi again Kevin!
I'll answer your questions in the opposite order, primarily since I'm
more certain about the answer.
LEFT JOIN is equivalent to LEFT OUTER JOIN - i.e. get all records from
the left table whether or not they match any records from the right
table. This forces the optimizer to deal with the left table before
the right table in the plan. Though if you include fields from the
right table in your where clause this may limit the entire result set.
JOIN is equivalent to INNER JOIN, i.e. only return records if there is
a match for the join clause. The optimizer generally has a bigger
choice when it determines the plan and the result is normally
different (and often faster).
Yes, I think the subselect will be run for every row that meets the
other criteria in the where clause. For my needs, NOT EXISTS has
generally been sufficient (I generally work with tables up to a few
million records) and I haven't used LEFT JOINS at all for this
purpose. If a NOT EXISTS is generally slower than using a left outer
join (I knew that inner joins were better, but am surprised about your
left outer join being that quick), then I may have to adjust how I
write that kind of queries in the future (speed and clarity/simplicity
of code are the main factors for writing queries, I think). Also, I'm
a fossile still working with Firebird 1.0.3 and haven't explored the
benefits of Firebird 1.5.
But, are you sure that your LEFT OUTER JOIN actually returns the
entire result set quicker or just the first few records (i.e. is the
difference between ORDER BY or not due to Firebird being able to
return the record immediately when finding one row without an order
by, whereas it needs to find all rows before returning anything with
an order by)? You could of course try order by d.doc_category+0,
though I somehow doubt that would help too much.
Anyway, I hope you manage to make your query execute faster on 1.0.3,
Set
I'll answer your questions in the opposite order, primarily since I'm
more certain about the answer.
LEFT JOIN is equivalent to LEFT OUTER JOIN - i.e. get all records from
the left table whether or not they match any records from the right
table. This forces the optimizer to deal with the left table before
the right table in the plan. Though if you include fields from the
right table in your where clause this may limit the entire result set.
JOIN is equivalent to INNER JOIN, i.e. only return records if there is
a match for the join clause. The optimizer generally has a bigger
choice when it determines the plan and the result is normally
different (and often faster).
Yes, I think the subselect will be run for every row that meets the
other criteria in the where clause. For my needs, NOT EXISTS has
generally been sufficient (I generally work with tables up to a few
million records) and I haven't used LEFT JOINS at all for this
purpose. If a NOT EXISTS is generally slower than using a left outer
join (I knew that inner joins were better, but am surprised about your
left outer join being that quick), then I may have to adjust how I
write that kind of queries in the future (speed and clarity/simplicity
of code are the main factors for writing queries, I think). Also, I'm
a fossile still working with Firebird 1.0.3 and haven't explored the
benefits of Firebird 1.5.
But, are you sure that your LEFT OUTER JOIN actually returns the
entire result set quicker or just the first few records (i.e. is the
difference between ORDER BY or not due to Firebird being able to
return the record immediately when finding one row without an order
by, whereas it needs to find all rows before returning anything with
an order by)? You could of course try order by d.doc_category+0,
though I somehow doubt that would help too much.
Anyway, I hope you manage to make your query execute faster on 1.0.3,
Set
--- In firebird-support@yahoogroups.com, Kevin Herrmann wrote:
> Set,
>
> Thanks for the reply. I should've added that on firebird 1.5.2, the
> optimizer correctly identifies the index for use by the 'order by'.
> It takes sub-second (compared to 45 or so on 1.0.3).
>
> I see all your points; I'm naturally averse to subselects! Your
> query seemed to produce same results, but did take longer than when
> 1.5.2 used the index. Isn't the subselect going to be run for every
> row that 'hits' in the main query? Won't the join syntax be
> generally faster (given the index being used, of course)?
>
> On the 'LEFT JOIN' point, isn't specifying 'LEFT' just syntactically
> complete? i.e. without it, it's a presumed 'LEFT' join, right ?
>
> Thanks very much for the help!
> Kevin
>
> > Subject: Re: Query plan not using index
> >
> >
> > Hi Kevin!
> >
> > --- In firebird-support@yahoogroups.com, "Kevin Herrmann" wrote:
> > > This query is horribly slow on Firebird 1.0.3 UNTIL I remove the
> > > 'order by'. I think I've included all things I need.
> >
> > Well, maybe you've included all things you need, but not
> > everything I need - you forgot the plan when you remove ORDER BY.
> >
> > > Here is the relevant Info:
> > >
> > > select d.*, e.NAME, e.GIVENNAME, a.EVENTTYPE from documents d
> > > left join entity e on d.subject_ID=e.id
> > > left join auditmessage a on d.id = a.id1 and a.eventtype = 19
> > > where d.doc_type in (5,6,7)
> > > and e.status=0
> > > and e.name <> 'MACRO'
> > > and e.AttendingID=2
> > > and a.ID1 is null
> > > order by d.doc_category;
> >
> > Hmm, strange query. Why select a.EVENTTYPE when it will always be
> > null? Later on you demand that a.ID1 is null, something which
> > means (in my eyes, you better check) that your query can be
> > rewritten as:
> >
> > select d.*, e.NAME, e.GIVENNAME from documents d
> > join entity e on d.subject_ID=e.id
> > where d.doc_type in (5,6,7)
> > and e.status=0
> > and e.name <> 'MACRO'
> > and e.AttendingID=2
> > and not exists(select * from auditmessage a
> > where d.id = a.id1 and a.eventtype = 19)
> > order by d.doc_category;
> >
> > As you can see, I also removed 'left' from entity, I don't think
> > there should be any difference in the result set between a left
> > [outer] join and an [inner] join in this case (again, I may be
> > wrong).
> >
> > > There is an index on doc_category, but the optimizer doesn't
> > > want to use it. Here is the Plan:
> >
> > Well, why should it? The return set isn't limited to documents,
> > and the result set has already been limited from that table due
> > to the doc_type being between 5 and 7. But you do limit the
> > choices of the optimizer by using left join rather than inner
> > joins.
> >
> > HTH,
> > Set