Subject | Re: Performance of Firebird vs. other DBMS |
---|---|
Author | laurenz_brein |
Post date | 2005-08-17T08:36:19Z |
Adam wrote:
It is now clear to me why all SELECT COUNT(*) will be expensive,
even the ones that I thought would do an index scan.
I am afraid that I didn't make it clear enough that with a
'simple select' I meant a statement like
SELECT NAME FROM PARENT WHERE ID = 42
These queries also perform badly, when running them together
with the COUNT(*) queries 4 times as badly as on other DBMS.
However, when I run only these 'simple selects', the performance
is a mere 50% worse than that of the worst competitor.
I guess that the extra load the COUNT(*)-statements imply will
explain the factor 4, and the 50% may be accounted for by the
record versioning overhead, right?
Firstly, in what I call the 'index scan' queries, I DO have a where
clause involving an indexed column, and at least sometimes the
optimizer should figure that using the index could save time.
Secondly, one of the other databases I tested IS Postgres, and
a) all selects, even the COUNT(*) ones, perform reasonably well, and
b) the results of the 'index scan' queries was much better than for
the 'full table scan' queries.
This should not be if Postgres worked like Forebird, right?
but I am very opposed to introducing redundancy into a database.
It can lead to inconsistencies. This 'good technique' is what I
would prefer to call a kludge.
I am well aware how 'meaningful' the result of a test like this is.
That is why I struck up this discussion to gain insight.
BTW, only less than 1% of all the statements in my test where
COUNT(*) queries, which I thought was reasonable.
Isn't an index a B*-Tree, so that all you have to do when finding
the maximum is to descend from each parent node to the leaf node
that contains the largest values?
I have always thought that ascending and descending indexes only
behave differently (that is, efficiently or inefficiently) when it
comes to sorting the result.
For what I call an index scan, I thought it didn't matter as the
index can be used to seperate the high values from the low ones,
no matter whether it is ascending or descending.
I thought it's just a question of choosing the 'left road' or the
'right road' at each index node.
Yours, and thanks for all the effort,
Laurenz Albe
> Firstly, David was not joking when he implied that select count(*)Thank you and David for enlightening me.
> based queries is a possible weakness in any MGA database.
It is now clear to me why all SELECT COUNT(*) will be expensive,
even the ones that I thought would do an index scan.
I am afraid that I didn't make it clear enough that with a
'simple select' I meant a statement like
SELECT NAME FROM PARENT WHERE ID = 42
These queries also perform badly, when running them together
with the COUNT(*) queries 4 times as badly as on other DBMS.
However, when I run only these 'simple selects', the performance
is a mere 50% worse than that of the worst competitor.
I guess that the extra load the COUNT(*)-statements imply will
explain the factor 4, and the 50% may be accounted for by the
record versioning overhead, right?
> Firebird will not even read the indexThis is interesting.
> if there is not a where clause to knock out some records, because
> the index read would just be just another I/O read for no gain.
>
> So you don't get the impression it is a weird Firebird thing, it
> is a problem for Postgres as well.
>
> (http://archives.postgresql.org/pgsql-hackers/2005-01/msg00247.php)
Firstly, in what I call the 'index scan' queries, I DO have a where
clause involving an indexed column, and at least sometimes the
optimizer should figure that using the index could save time.
Secondly, one of the other databases I tested IS Postgres, and
a) all selects, even the COUNT(*) ones, perform reasonably well, and
b) the results of the 'index scan' queries was much better than for
the 'full table scan' queries.
This should not be if Postgres worked like Forebird, right?
> Next, there are some good techniques in a MGA system such asSorry to disagree, and I guess it is also a matter of taste,
> Firebird for improving the speed considerably. I wont repeat it,
> but it is right here
>
> (http://groups.yahoo.com/group/firebird-support/message/56457)
but I am very opposed to introducing redundancy into a database.
It can lead to inconsistencies. This 'good technique' is what I
would prefer to call a kludge.
> Thirdly, I would certainly hope that for you 50% of the requirementYes, you are absolutely right.
> of your database is count(*) related, because you are testing and
> basing your judgements (possibly unintentionally) on that parameter.
>
> If you run queries that actually have to fetch and return data,
> the results will be significantly closer, and if you run the
> inserts concurrently to the selects, you may find the readers and
> writers not blocking each other actually gives MGA a real edge.
I am well aware how 'meaningful' the result of a test like this is.
That is why I struck up this discussion to gain insight.
BTW, only less than 1% of all the statements in my test where
COUNT(*) queries, which I thought was reasonable.
> Fouthly (and finally), there is one difference in Firebird's indexSorry to be so dense, but I need an explanation for this.
> structure that you should be aware of. It is directional, not
> bi-directional. This allows a greater density, and can still help
> with most operations, but certainly not all. An index is ascending
> unless you specifically declare it to be descending. That means
> that a normal index on a field qty could not help you run the
> following
>
> select max(qty) from tablea
Isn't an index a B*-Tree, so that all you have to do when finding
the maximum is to descend from each parent node to the leaf node
that contains the largest values?
I have always thought that ascending and descending indexes only
behave differently (that is, efficiently or inefficiently) when it
comes to sorting the result.
For what I call an index scan, I thought it didn't matter as the
index can be used to seperate the high values from the low ones,
no matter whether it is ascending or descending.
I thought it's just a question of choosing the 'left road' or the
'right road' at each index node.
Yours, and thanks for all the effort,
Laurenz Albe