firebird-support - Re: Is there a way to optimize a count()? or am i doing it the wrong way?

Subject	Re: Is there a way to optimize a count()? or am i doing it the wrong way?
Author	Adam
Post date	2006-08-08T23:53:28Z

--- In firebird-support@yahoogroups.com, "Fabio Gomes" <fabioxgn@...>
wrote:

>
> I have this query:
>
> SELECT count(I.vnd_cod)
> FROM items_venda I
> JOIN vendas N ON (N.vnd_cod=I.vnd_cod)
> JOIN clientes C ON (N.cli_cod=C.cli_cod)
> JOIN vendedores V ON (N.ved_cod=V.ved_cod)
> WHERE I.pro_cod='1'
>
> I m using it on php, so this code is just for the pagination, the main
> query will just select 20 rows, no more than it. but i need a count of
> all the rows so i can know how many rows the query returned, and pass
> it to my pagination function.
>
> We have a slow server here (P4 2.0ghz 512MB) this query takes about
> 20+ seconds to return and uses almost 100% of the cpu to do it.
>
> So, is there a way to optmize this count() function?
>
> I m asking it cause currently we have just one company using this
> application, but my boss said that almost sold it to another company
> that have a lot more data then the current one, so i m beggining to
> worry about the performance of some stuff, cause i use this in almost
> all reports, this one is the worst in performance, but wating 20
> seconds to show up 20 rows is too much!

You are right, that is way too slow.

Count is an expensive operation (just as expensive in prepare and
execution as selecting a field, quicker on fetch obviously). I think
your problem is most likely to do with garbage collection rather than
the count per se. You are probably 'shooting the messenger' by blaming
count. If you remove the count, it will take 20+ seconds to return the
rows in the select itself, because instead of the count cleaning up
the garbage, your select is.

Some Firebird 2 index improvements will improve this, but in the
meantime you must avoid creating an index on a table with lots of
duplicates. Such an index costs a lot of CPU time to maintain when
removing a record, which is what garbage collection largely does.

But lets take a look at your query anyway to make sure it is not just
using a poor index. How long does the following query take?

SELECT I.vnd_cod
FROM items_venda I
JOIN vendas N ON (N.vnd_cod=I.vnd_cod)
JOIN clientes C ON (N.cli_cod=C.cli_cod)
JOIN vendedores V ON (N.ved_cod=V.ved_cod)
WHERE I.pro_cod='1'

I assume there is an index that can be used to resolve your where
clause, and an index that can be used for the joins.

Adam