Subject Denormalized stats maintenance (was Features implementation request / plans)
Author Jerome Bouvattier
Hi Lester,

> > Although I am trying to avoid using it, select count(*) from table,
> > should really be improved :-(
> > Sometimes my customers would like to have an idea of the database
> > size, and this beast must run. For now, I just update a table with
> > the record count during the update process, and instead of running
> > the query, I display the correct field from that database.
>
> That IS the correct way to do it. The count is only valid for the
> transaction that accesses it, and only you know how you are counting
> records. As well as a total count ( or just SUM sub totals ) it is very
> useful to keep sub-total counts for different groups of records, and
> only you know how to do that :)

I need to implement such a system. I'd appreciate your insights.

I will need to keep stats (count but also min, max, our_internal_function)
for "group of records" as you said.
My "groups of records" contain quite volatile data. I expect to see at most
1000 insert/deletes per hour and per "group".
As you can guess, the stats mentionned require different calculation
resources. e.g a Count stat only implies increment/decrement to be
maintained, while some of our_internal_functions will imply a lookup on the
whole "group" each time.

Here are my questions :

- Should I worry about locking issues ?
- Would a scheduled maintenance of those stats be better than "on the fly"
maintenance then ?
- In the latter instance, how to flag/determine the remaining work that the
maintenance job needs to take care of ?

Any other advice, trick will be well appreciated.

Thanks in advance.

--
Jerome