Subject Re: Since I shouldn't use COUNT....
Author Adam
--- In, <myles@...> wrote:
> I have a database with a statistics table that has a record for each
time an
> object on my server is accessed. Its like an audit trail, and
contains the
> IP address that accessed it, etc. It has a foreign key reference to the
> object that the access was for (ie. It's a many to one relationship).
> I need to return the top 10 accessed objects for a given month. I am
> developing a stored procedure to do this, and I have it working (well
> sorta)...
> I have it so that it returns the list of objects accessed based on the
> number of rows in this table within a certain date range. I can
limit the
> returned values by using the DISTINCT and Limit capabilities of
Firebird so
> that I get 10 records, however I need it to return the following values:
> I have no problem cross referencing the ID of the statistics table's
> key reference to the object table, and returning information from
the object
> table (ie. Object_ID and Object_Name). However I want to also
return the
> number of stats records per object as well.
> I can't see any way of doing this without using the COUNT function, but
> according the Helen's incredible tome of knowledge on Firebird, it's
a big
> NO NO to use the COUNT function. Without returning every statistics
row to
> my PHP application that is going to format and display the values, what
> other way is there to return the number of rows that meet a criteria
in a
> select statement in Firebird?

Its not so much a no-no to use it, it is just not a cheap operation,
nowhere near as cheap as its locking database cousins which can answer
near immediately. There is not enough information in the index to know
whether the particular node is to be counted or not as it may refer to
a deleted record or uncommitted insert or update etc, or even a
committed value that your snapshot transaction can not see due to
transaction isolation.

You often see logic like

if [Count(*)] > 0 then .....

The problem is that you have just forced Firebird to go to each data
page when really you have no interest in the count, you only care
whether at least one exists.

A count may be required for a particular function (although it is
unusual to try a count on an entire table). There are some good tricks
for getting a fast approximate count if that is good enough, there are
also some tricks to maintaining a counter using triggers whilst
retaining the non-locking. It just requires a stored procedure to be
executed periodically to minimise the number or records to be summed.


> Here's my stored procedure so far (it's a work in progress)...
> begin
> /* Returns the top 10 collections for the current month */
> /* Get the current month for the search */
> /* Now find the top 10 statistics for visible collections */
> for
> select
> (select C.COL_NO from COLLECTION C where
> C.COL_NO = (select SND.COL_NO from SOUND SND where
> (select C.COL_NAME from COLLECTION C where
> C.COL_NO = (select SND.COL_NO from SOUND SND where
> from stats ST
> do
> begin
> suspend;
> end
> end
> I'm not limiting the number of returned rows or sorting them on
> just yet, but I want to add a COUNT statistic so that I know the
number of
> statistics records that were found for each collection.
> Any ideas?
> Myles
> ============================
> Myles Wakeham
> Director of Engineering
> Tech Solutions US, Inc.
> Scottsdale, Arizona USA
> Phone (480) 451-7440