Subject Re: [firebird-support] Natural Plan when grouping by 5 indexed columns?
Author Richard Wesley
Sean -

On Jan 31, 2007, at 19:09, Leyne, Sean wrote:

> Since you haven't specified any WHERE criteria, you are asking for the
> complete table.

Not quite - I am asking for the unique combinations of dimension
values, which can be obtained from the indecies. The query returns
1932 rows, which is less than 5% of the rows in the table. Plus, the
five columns of this dimension are not as wide as the entire table.

> That being the case, it is always faster to perform a natural scan and
> sort rather than walk an index.
> Why is it faster?
> Because the index will be sorted in value order, and not page order.
> This means that while walking the index the engine would cause the
> HD to
> thrash as the engine reads the appropriate data page for each
> record in
> the index.

This is why I asked if hierarchal levels benefit from composite
indecies - I was wondering if this was a clue to the optimiser that
this is a hierarchy. I understand that if we had to read every row
in random order, it would be a performance nightmare, but we actually
only need to read < 5% of the row data, and IIRC that data is ALL in
the indecies. But as Adam points out, this is an unusual case and
maybe the optimiser does not notice it. And that is a fine answer.

> Why does it read the records?
> Because Firebird is a multi-generational engine, which means that the
> indexes contain data which may be 'dirty', the engine needs to read
> the
> records to confirm the data values (for this transaction).

Jet was simply the engine I would have expected it to beat, although
I appreciate that it does not have the transactional overhead that FB
has. But MySQL 4/5, Postrges 8, SQL Server 2000/2005 and Oracle 9i/
10g also outperform FB2 by a significant margin on this one test. The
Postgres server is running on the same machine, so it is a tolerably
fair comparison.

Please understand: I appreciate that all databases have strengths
and weaknesses - I am just trying to understand why FB does
relatively poorly on this particular test. I have almost 1000 other
tests where it is quite competitive, but I make a point of going
through our results looking for weak points in our query generation.
FB has a small (<10) number of tests where its performance sticks out
like a sore thumb and I'm just trying to understand what is going on
and whether I can do anything about it.

And just to assuage any wounded partisan feelings: while Postgres
actually performs the best on this particular test, it has its own
set of outliers that I need to investigate.

Best regards,
Richard Wesley Senior Software Developer Tableau