Subject | Re: Index not used by "where xxx not null" query? |
---|---|
Author | Svein Erling Tysvær |
Post date | 2006-02-28T08:56:42Z |
Hi Kjell!
I guess I am lucky or unlucky to only have to deal with Firebird, but
in general I find most queries to perform excellent without much
thinking about performance. Normally, I just prepare a query, take a
quick glance at the plan (check that NATURAL is at worst on the first
table in the plan, and that most tables thereafter are linked through
one index only) and that's it. Sometimes I have to do minor
modifications (normally add +0 or something to avoid use of certain
indexes), but not too often. And through this list, I've found that
problems with performance have become less common as Firebird has
progressed. One thing that I know for sure has changed, is that
writing 'IN <subselect>' in the where clause, will be transformed to
'EXISTS <subselect>' by the optimizer in many cases (with the
subselect executed for every row, 'IN <subselect>' was a show stopper
that I completely abandoned myself). I don't know if the optimizer
does similar things with transforming 'NOT < <value>' into '>=
<value>', but it could do and the only thing that would consistently
break was those situations where people have used 'NOT <' in order to
avoid an index (we normally recommend +0 or ||'' for that purpose, so
it may not be too big a problem).
Since you experience lots of trouble with queries and I hardly any,
maybe there are other things that differ? E.g. I use surrogate keys
consisting of one single integer in virtually all tables (at least, I
never have two fields in the primary key), and my indexes consist of
one field only (here there is an exception in that I add the primary
key at the end of the index for non-selective fields).
I only use LEFT JOIN where necessary, INNER JOINs are considerably
less complicated for the optimizer. Though of course, there are
situations where LEFT JOIN is the answer to the current problem. And,
of course, this also means that I use JOIN as introduced in SQL-92
rather than joining in the WHERE clause.
Once upon a time, InterBase (or its predecessor) chose the ideal plan
every time. Though there was a small problem. Even though the
execution was done in seconds, complex queries took many hours to
prepare (I wasn't around at that time, but I think Ann mentioned it
long ago).
Set
-friend of the optimizer
I guess I am lucky or unlucky to only have to deal with Firebird, but
in general I find most queries to perform excellent without much
thinking about performance. Normally, I just prepare a query, take a
quick glance at the plan (check that NATURAL is at worst on the first
table in the plan, and that most tables thereafter are linked through
one index only) and that's it. Sometimes I have to do minor
modifications (normally add +0 or something to avoid use of certain
indexes), but not too often. And through this list, I've found that
problems with performance have become less common as Firebird has
progressed. One thing that I know for sure has changed, is that
writing 'IN <subselect>' in the where clause, will be transformed to
'EXISTS <subselect>' by the optimizer in many cases (with the
subselect executed for every row, 'IN <subselect>' was a show stopper
that I completely abandoned myself). I don't know if the optimizer
does similar things with transforming 'NOT < <value>' into '>=
<value>', but it could do and the only thing that would consistently
break was those situations where people have used 'NOT <' in order to
avoid an index (we normally recommend +0 or ||'' for that purpose, so
it may not be too big a problem).
Since you experience lots of trouble with queries and I hardly any,
maybe there are other things that differ? E.g. I use surrogate keys
consisting of one single integer in virtually all tables (at least, I
never have two fields in the primary key), and my indexes consist of
one field only (here there is an exception in that I add the primary
key at the end of the index for non-selective fields).
I only use LEFT JOIN where necessary, INNER JOINs are considerably
less complicated for the optimizer. Though of course, there are
situations where LEFT JOIN is the answer to the current problem. And,
of course, this also means that I use JOIN as introduced in SQL-92
rather than joining in the WHERE clause.
Once upon a time, InterBase (or its predecessor) chose the ideal plan
every time. Though there was a small problem. Even though the
execution was done in seconds, complex queries took many hours to
prepare (I wasn't around at that time, but I think Ann mentioned it
long ago).
Set
-friend of the optimizer
--- In firebird-support@yahoogroups.com, Kjell Rilbe wrote:
>
> Ann W. Harrison wrote:
>
> > Err, _not_ quite. In fact the key is the "_not_". Since the
> > beginning of (its) InterBase / Firebird have _not_ used
> > indexes with _NOT_ conditions. Whether (or _not_) that's
> > appropriate with expressions like NOT NULL is debatable.
> > These are indexable:
> >
> > where x.a = 1
> > where x.a > 10
> > where x.a <= 10
> > where x.a IS NULL
> > where x.a between 1 and 10
> >
> > These are _not_
> >
> > where x.a NOT = 1
> > where x.a NOT > 10
> > where x.a NOT <= 10
> > where x.a IS NOT NULL
> > where x.a NOT between 1 and 10
>
> Thanks for the crystal clear explanation Ann!
>
> <rant>
> This is yet another example of all those peculiarities you have to
> be aware of when using Firebird.
>
> When I write a query without really thinking hard about how to
> construct it, what indices I *must* have, etc., to get *decent*
> performance, I usually get *decent* results with SQL Server. With
> Firebird on the other hand, I often get ridiculously bad
> performance.
>
> Sure, the queries, indices etc. in Firbird can be tuned to match SQL
> Server in most cases, and even outperform it from time to time. But
> why, oh why, do I always have to spend so much time tinkering with
> the Firebird queries to get "decent or better" performance?
>
> I know all that about tradeoffs and that Firebird has its main focus
> on handling thousands of concurrent users without causing locking
> problems, but does this really *have* to imply that the query
> optimizer has to fail to make the right decisions in so many
> situations?
> </rant>