Subject Re: Why does it take hours to perform a select count(*) on a table with 586,000
Author Adam
> I have a table with 586,000 records. When I try to get a record count
> using select count(*) from table, it takes hours. Why is this?

In a locking database, everyone gets the same count. This count can
simply jump through the index nodes and return the number, because
each index node refers to a record you want to count.

In a MGA database such as Firebird or Postgres, indices will contain
the following:

* Records you can see that you want to count
* Records you can see that have been deleted by another transaction
that you want to count
* Records you can not see because they were inserted by some other
transaction that was not committed at the time your transaction
started (non read committed type) or not committed yet (any type).
* Records that have been deleted but are yet to be removed by the
garbage collection process.

Some of these may be a bit hard to understand if you are new to MGA,
but in a nutshell, all you need to know is that the index contains
stuff your transaction should not count. Therefore, even after you
read the index, your transaction will visit the data page and
determine by which is the latest transaction that you can see whether
it is in fact counted or not.

Furthermore, if your transaction "notices" that the record is not
interesting to anyone at all, it falls on your to take out the
garbage. I believe in your case this is what is happenning. A count on
this many records should be measured in seconds not hours.

> How can
> I get a quick count?

Set has given you a link to a way of getting a 'guestimate' reasonably
quickly, but of course it contains deleted records etc so it is not exact.

If you want a fast, exact count, you need to track the count yourself
using triggers and a second table.

TableA
(
ID integer,
blah integer
);

TableACount
(
cnt Integer
);

Have a before insert trigger on TableA that does the following

insert into TableACount (cnt) values (1);

And a before delete trigger on TableA

insert into TableACount (cnt) values (-1);

Now add a stored procedure

CREATE PROCEDURE SP_SETTABLEACOUNT
AS
BEGIN
SELECT SUM(CNT)
FROM TABLEACOUNT
INTO :CNT;

DELETE FROM TTABLEACOUNT;
INSERT INTO TABLEACOUNT (CNT) VALUES (:CNT);
END

Using this mechanism, you have a MGA compliant count. When you want
the count, you can simply call:

SELECT SUM(CNT)
FROM TABLEACOUNT

You don't ever call the stored procedure, instead schedule it to be
executed every week/day/hour/whatever. Your program will still need to
add numbers to get a count, but providing the stored procedure is run
frequently enough, you will only be adding a few numbers together.

Adam