Subject Re: [firebird-support] Re: Is there a way to optimize a count()? or am i doing it the wrong way?
Author Fabio Gomes
Hey guys... given the interest in this thread, i started to worry if the
problem was really the count operation, so i just made a test with 3 querys.

The second query is a subtotal that sums all values before the current page,
in this test i clicked on the last page, so if i have 10000 records, the
second query had to sum 9980 records i forgot this SUM() stuff and was just
thinking that the problem was the count, and it wasnt.

So, the count isnt the problem alone, i think it was slow, took 2 seconds in
a database with less than 10K records, but the real bad guy in this whole
thing was me and my SUM stuff, it took 4 seconds.

So now i know why this is slow:

it took:
2 seconds to run the cound
4 seconds to sum the values
and 1 second to select the 20 records.

= 7 seconds to make the queries...

if you sum this to the network delay and the time php takes to parse my page
contents you will get the 10~20 seconds i was talking about.

and the cpu usage is due to the SUM() operation, not the count.

Sorry guys for all the fuss, and thanx for your help.


On 8/10/06, Svein Erling Tysvaer <svein.erling.tysvaer@...>
> You are probably right Adam, in that the count is not the culprit. But I
> think Fabio joins four tables, count x rows and then select & display 20
> of them at a time. I don't know how many digits there are in 'x' and
> agree that the count by itself cannot be the cause if it is a reasonably
> small number (whether that small number is 20 or 10000). But using a
> count for PAGINATION sounds like a bad idea unless you are certain that
> the number is limited and that your select statement is not all too
> complicated.
> I almost wish it had been possible to write something like COUNT(*, 500)
> to count up to 500 and return 500 if more than 500 was available. Though
> I don't know if I'd ever use it if it was available ;o).
> Set
> Adam wrote:
> >> every case to be a facet of "you can't count on an MGA system."
> >> I would prefer to say: "You shouldn't COUNT on an MGA system"
> >>
> >
> > I wouldn't go that far. I would say that 'COUNTING is an expensive
> > operation on a MGA system and not a substitute for an exists check'.
> >
> > I don't think the problem here is the count itself, it is a bit of a
> > 'red herring'.
> >
> > Whilst getting a count of the number of records is more expensive than
> > in a locking database, I think everyone here would agree that 20
> > seconds to return a count of 20, ** SOMETHING ELSE ** is causing the
> > slowness.
> >
> > I would expect that retrieving a count is less expensive than the
> > corresponding select statement, because there is less fetch time, but
> > all else being equal, the prepare and execution time are identical.
> >
> > As a test I just counted 298700 records from a test database I have
> > loaded. It took
> >
> > Prepare : 1 ms
> > Execution : 446 ms
> > Fetch : 9 ms
> > -------------------
> > Total : 456 ms
> >
> > This was a simple select count(*) query with no joins. There is no
> > garbage in this table because it was recently restored. From a locking
> > database I would expect sub 20ms response because it does not have the
> > expensive execution time of a MGA database.
> >
> > When I add a join to another table with around 135000 records, the
> > execution time blows out to 2.2 seconds.
> >
> > Clearly, the problem is not with the count performance in Firebird if
> > it is taking 20 seconds to count 20 records. In a straight count,
> > Firebird is capable of 600,000 records per second on my laptop. With a
> > join involved it is still managing over 125,000 records per second.
> >
> > The problem here is most likely garbage.
> >
> > Adam

[Non-text portions of this message have been removed]