Subject Re: No index used for join on 'starting with'
Author Dmitry Yemanov
11.04.2012 19:09, unordained wrote:

> a) for every row in A (regardless of order), do a PK lookup in B [which,
> ignoring index-caching, or some implementation tweak to remember where
> in the
> index you "last were" as a best-guess for the next lookup, is wasteful,
> as it would restart from the root of the index each time]

This is known as a nested loop join. And "restart from the root" costs
just a couple of page reads that are likely to be satisfied using the
cache. Not something really wasteful.

> b) indexed join, as I was once forced to implement by hand in Cobol,
> using two
> pre-sorted streams, keeping track of which one is "ahead" and advancing
> the one
> that is "behind" (I do not at all remember what that's called in theory,
> I'm sure it has a nice name.)

It depends on what is hidden behind "pre-sorted". There are two options:

1) Streams are read naturally and re-sorted by the join key, this is
what's known as sort/merge join in Firebird. The extra sorting cost
makes this option more expensive than the nested loop join. Firebird
defaults to this option only if there is nothing better available.

2) Streams are read in the index order that allows to avoid an extra
sorting. But navigating the whole tables in the index order will cause
extremely random I/O (at least for the FK driven table) so it's also
going to be expensive. Firebird never chooses such an option at all.

> I assume (b) is what you're saying FB3 does for full-outer-joins (yay!
> [that
> explains why I've sometimes been forced to do unions of left, inner, and
> right
> joins]), but why wouldn't (b) be a valid strategy for other join types too?

It would for left/right joins and perhaps could for full outer joins.
Not supported at the moment though.

> (That's what I was trying to convince FB to do with a PLAN. I never use
> explicit PLANs, is there syntactic magic to it?)

It rejects everything it doesn't support.

> Still, this doesn't explain why I do see it sometimes using an index to
> support a starts-with join, and he doesn't.

I can hardly guess without a test case.


Dmitry