Subject | Re: [IB-Architect] OFF-TOPIC: Re: Query joins |
---|---|
Author | Jim Starkey |
Post date | 2001-04-29T16:00:22Z |
At 09:03 PM 4/29/01 +1000, Helen Borrie wrote:
has considered it a support problem. It all likelihood it is a
symptom an optimizer bug induced by a misunderstanding of the
basic architecture structure of the optimizer. I Ray Mond
wants a work around, support is indeed a place to go. If he
wants an explanation or even better, a fix, he's come to the
right place.
There are (at least) three strategies for optimizers, rule based,
cost based, and cop-out.
Rule based optimizers use a complex sets of rules to heuristically
determine an execution strategy. Cost base optimizers estimate the
cost of each alternative using table cardinalities and index
selectivities and pick the cheapest. Cop-out optimizers throw their
hands in the air and let the users pick or influence the decision.
Almost all neophyte database developers start with rule based
optimizers. The smart one discover the problem is too complex
to handle with rules and the dumb ones find another line of
work or transition to cop-out optimizers (the Borland strategy).
The firebird optimizer started as a pure cost based optimizer.
Sometime during V3 development a user discovered that although
it was able to optimizer a 17 way join to execute in about 30
seconds, the optimization process itself took 12.5 hours. It
was clearly time for me to get smarter. The fix was to add
heuristics to prune the search tree. This involved two changes.
First, only join orders that made sense were evaluated
(uncorrelated tables first, tables reached by uniques
second, etc). Second, at each stage an estimated cost for
be case completion for subbranches is computed, and if greater
than the current best case, the sub-tree is abandoned. This
brought the time for the 17 way join from 12.5 hours to a
few seconds without missing the best case.
There is a problem with cost based optimizers is that they
are only as good as the information available to them, and
that maintain accurate information is prohitively expense
(less so in the super-server architecture, but that is a
different question). There are lots of pathological cases
where the cost estimates don't extrapolate well to the actual
cost. But these are nothing like the hash that rule based
optimizers make of many simple situations. But I don't
think this is the problem.
Sometime during V4 somebody (who shall remain nameless) stuck
in a couple of rules itno to the cost based optimizer that just
plain broke it. It doesn't produce the wrong answer, which
got it by Borland QA department, it just didn't work very
well (which the Borland QA department could identify with).
Plans were introduced as a work around for the busted optimizer.
Plans are an admission of defeat. Pox of plans. Or more
accurately, pox of the need for plans. Are there any compilers
so lame that you have to tell it how to remove invariants from
loop?
It sounds like Ray Mond has a lovely test case to debug the
porker. Somebody should take it on and make the world safe
for queries. Gentlemen, ladies: Here is a chance for great
glory! Jump in, learn about streams, rivers, and tributaries
to become the Exalted Optimizer Guru!
Jim Starkey
>www.yahoogroups.com/community/IB-Architect. This question is off-topic.
>Please read the rules of this list at
>I respectively disagree. This is a problem only because Borland
has considered it a support problem. It all likelihood it is a
symptom an optimizer bug induced by a misunderstanding of the
basic architecture structure of the optimizer. I Ray Mond
wants a work around, support is indeed a place to go. If he
wants an explanation or even better, a fix, he's come to the
right place.
There are (at least) three strategies for optimizers, rule based,
cost based, and cop-out.
Rule based optimizers use a complex sets of rules to heuristically
determine an execution strategy. Cost base optimizers estimate the
cost of each alternative using table cardinalities and index
selectivities and pick the cheapest. Cop-out optimizers throw their
hands in the air and let the users pick or influence the decision.
Almost all neophyte database developers start with rule based
optimizers. The smart one discover the problem is too complex
to handle with rules and the dumb ones find another line of
work or transition to cop-out optimizers (the Borland strategy).
The firebird optimizer started as a pure cost based optimizer.
Sometime during V3 development a user discovered that although
it was able to optimizer a 17 way join to execute in about 30
seconds, the optimization process itself took 12.5 hours. It
was clearly time for me to get smarter. The fix was to add
heuristics to prune the search tree. This involved two changes.
First, only join orders that made sense were evaluated
(uncorrelated tables first, tables reached by uniques
second, etc). Second, at each stage an estimated cost for
be case completion for subbranches is computed, and if greater
than the current best case, the sub-tree is abandoned. This
brought the time for the 17 way join from 12.5 hours to a
few seconds without missing the best case.
There is a problem with cost based optimizers is that they
are only as good as the information available to them, and
that maintain accurate information is prohitively expense
(less so in the super-server architecture, but that is a
different question). There are lots of pathological cases
where the cost estimates don't extrapolate well to the actual
cost. But these are nothing like the hash that rule based
optimizers make of many simple situations. But I don't
think this is the problem.
Sometime during V4 somebody (who shall remain nameless) stuck
in a couple of rules itno to the cost based optimizer that just
plain broke it. It doesn't produce the wrong answer, which
got it by Borland QA department, it just didn't work very
well (which the Borland QA department could identify with).
Plans were introduced as a work around for the busted optimizer.
Plans are an admission of defeat. Pox of plans. Or more
accurately, pox of the need for plans. Are there any compilers
so lame that you have to tell it how to remove invariants from
loop?
It sounds like Ray Mond has a lovely test case to debug the
porker. Somebody should take it on and make the world safe
for queries. Gentlemen, ladies: Here is a chance for great
glory! Jump in, learn about streams, rivers, and tributaries
to become the Exalted Optimizer Guru!
Jim Starkey