Subject The age old 'count' issue...
Author Myles Wakeham
I'm having a bit of a challenge, and trying to re-think design, etc.
around this issue.

I have a web application that displays results in a grid format. Like
most web apps, I send down to the browser 'pages' of rows (ie. Page X of
Y), and let the user navigate forward or backwards through the rows.
Where possible I've tried to avoid this approach as I really have to
question if a user needs to see a list of results, rather than a better
search facility however in the case of chronological records (ie. Audit
Logs, etc.) there really isn't a way to give the user the visibility to
a series of events than to give them a list like this.

With that said, I ran into performance problems with data sets that were
100,000 rows or larger in storage. I access all data through stored
procedures, and in some cases my queries have 2-5 inner joins to other
tables. When I attempted to get all data, and then split it into pages
in PHP (my web app dev environment), its just too slow. I found out
quite a while back that the reason is that I was forcing the database to
return the entire result set to the PHP server app, so I changed my
design to allow the passing into the requesting stored procedure the
SKIP and no of rows returned values, and then dynamically build a query
to return that data set using FOR EXECUTE STATEMENT construct. This
works much better and I've seen performance increases of 50x as a result.

There is one last part of this optimization issue though and it has to
do with getting the total number of rows found in the query. I
understand that using COUNT(*) is a bad idea. But in order to know how
many pages of data the user can browse through, I'm completely at a loss
as to how to get this number dynamically. Note that the user has the
option to filter the list down to a specific search criteria, so I can't
store the total number of rows through a trigger and get it this way
because I won't know their search criteria. Basically what I need to be
able to do is issue the query twice in the stored procedure - once to
get the total number of rows it would return, and secondly to return
only the set of rows that makes sense for their query. But I'm trying
to do this without a performance impact.

Can it be done? Are there any tricks to do this sort of thing?

Thanks in advance for any suggestions.

Myles
--
-----------------------------
Myles Wakeham
Director of Engineering
Tech Solutions USA, Inc.
Scottsdale, Arizona USA
www.techsolusa.com
Phone +1-480-451-7440