Subject Re: Subqueries and indexes
Author yoav112003
INNER JOIN seems to be a lot better indeed.

Thanks.

--- In firebird-support@yahoogroups.com, Svein Erling Tysvaer
<svein.erling.tysvaer@...> wrote:
>
> yoav112003 wrote:
> > Shouldn't the 'WHERE events.ROWID IN ...' clause be a hint to use the
> > index on this field?
>
> It is, but which place in the index should it look at? Using IN
> (subselect) means that Firebird goes through EVERY row of EVENTS,
> calculates the subselect and sees if the result matches ROWID.
>
> events.ROWID IN (5, 7, 9) can just look up 5, 7, and 9 in the rowid
> index, Firebird doesn't know how to do the same with IN (subselect). In
> general, subselects are executed for every potential row - even though
> it in some cases could have been executed only once. Think of the case
>
> SELECT distinct ROWID FROM events
> WHERE events.ROWID IN (
> SELECT DESTINATION.EVENTID FROM DESTINATION
> WHERE ( (DESTINATION.IP = events.IP))
> )
>
> There is no way to use an index when the values aren't fixed in one way
> or the other.
>
> An inner join can be a lot better. Another option would be for you to
> use a stored procedure with two FOR SELECT loops. First get the
> DESTINATION.EVENTID and then events.ROWID.
>
> HTH,
> Set
>
> > As you can read in my reply to Milan, I tried to use a custom plan
> > with this query which used this index, but I got an error message.
> >
> > The query suggested by Milan and Helen have the same performance as
> > the my query, my guess is because they too require a full table scan
> > on the events table.
> >
> > It seems that the best solution is to use an INNER JOIN query if
> > possible, but I am still curious to know the reason the ROWID index
> > cannot be used.
> >
> > --- In firebird-support@yahoogroups.com, Svein Erling Tysvaer
> > <svein.erling.tysvaer@> wrote:
> >> Helen and Milan helped you get rid of the irrelevant
IDX_DESTINATION2
> >> index, but didn't answer your WHY question.
> >>
> >> Let's pretend we're 'optimizers' who should find a plan for the main
> >> part of your query.
> >>
> >> First, how do we limit the rows? Ah, EVENTS.ROWID does that and that
> >> field has an index! What should it be? Hmm, it should have one of
the
> >> values in a subselect... Well, I'm not clever enough to know if the
> >> subselect is constant or varies for each row, hence I cannot use an
> >> index for that requirement. No further criteria for EVENTS? OK,
I'll go
> >> NATURAL then. (I skipped the bit with why it doesn't use the
index for
> >> the DISTINCT - I think the reason for the subselect, but not the
main
> >> select using an INDEX for the DISTINCT, is that the subselect can
> > create
> >> a bitmap of the EVENTID and IP indexes, whereas combining NATURAL
with
> >> any index is nonsense.)
> >>
> >> Now, I don't know how clever the optimizer has to be to meet the
> > 'clever
> >> enough' criteria. Clearly, in your case it could find the
> >> DESTINATION.EVENTID "subresult set" only once, but (sub)queries
can be
> >> much more complicated and Firebird has to handle every possible
> >> situation. Currently it doesn't recognize whether subselects are
> >> completely independent from the main select (and could be converted
> > into
> >> something like a join), I have no idea of whether it will be able
to do
> >> this in the future.
> >>
> >> Set
> >> -friend of the optimizer
> >>
> >> yoav112003 wrote:
> >>> I am trying to run the following query (FB 2 RC3):
> >>>
> >>> SELECT distinct ROWID FROM events
> >>> WHERE events.ROWID IN (
> >>>
> >>> SELECT DISTINCT DESTINATION.EVENTID FROM DESTINATION
> >>> WHERE ( (DESTINATION.IP ='158.125.73.220'))
> >>> )
> >>>
> >>> The default plan that is used is:
> >>> PLAN SORT ((DESTINATION INDEX (IDX_DESTINATION2, IDX_DESTINATION1)))
> >>> PLAN SORT ((SEAM_EVENT NATURAL))
> >>>
> >>> where IDX_DESTINATION2, IDX_DESTINATION1 are indexes on
> >>> DESTINATION.EVENTID and DESTINATION.IP respectively.
> >>>
> >>> This plan is very ineffecient since the plan does not rely on the
> >>> index for events.ROWID. When I try to use a customized plan with
this
> >>> index, FB replies that the index cannot be used in this query.
> >>>
> >>> Is there anything else that can be done (except for using INNER
JOIN)?
> >>> Why can't FB use the indexes on the events table?
> >>>
> >>> Thanks.
>