Subject: Re: Any workarounds for performance issues with multi-generational architecture?
Author: Adam
--- In firebird-support@yahoogroups.com, "bigloudjeff"
<bigloudjeff@y...> wrote:
>
> I have just spent a couple of weeks writing a pseudo-DAO wrapper for
> Firebird so that I could more easily migrate a legacy code base to
> Firebird and overcome some JET database limitations. I finished the
> initial version and started testing it on some smaller/slower
> hardware, and found some disappointing results.
>
> In order to emulate DAO's recordset functionality, I have to do
> things like "select * from table" or "select count(*) from table".
> The tables I am dealing with have ~100,000 rows. On my development
> machine I did not notice any big performance hit. However, it seems
> that on a laptop with limited RAM and a slower drive the performance
> penalty of these types of statements is quite big. Executing "select
> count(*) from table" takes about 6 minutes on the laptop I am
> testing on.
>
> After doing some research into this, it seems that Firebird's
> multi-generational architecture does not lend itself well to
> operations on large result sets in limited-resource situations. My
> understanding is that no matter how the select is performed (with or
> without an index), all of the records that could end up in the
> result set have to be visited directly to determine which
> one/version should be included in the count or the result set.

That is precisely the "problem". What do you mean by "count"? You
obviously mean how many records are in the table, but are we counting
uncommitted inserts? What about deleted records that are still
visible to someone in the database?

Firebird defines the count operation as the number of records that
are visible to the transaction the count operation is run inside.
There has been talk before about including enough information in the
index to work out whether a record version is visible to your
transaction, but then the average size of the index would have to
grow enormously, and every other operation would slow down as a
result.

>
> Unfortunately, I am not able to make major changes to the legacy
> app I am trying to convert. So, I need to be able to do things like
> finding out the number of rows in a table/recordset before fetching
> them all.
>
> Are there any know workarounds for these scenarios?

There are two that I am aware of.

There is a query to get the "approximate" count. This includes
garbage yet to be collected, but if you need a rough idea rather than
precise numbers, it is fine:

http://www.fbtalk.net/viewtopic.php?id=164
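The exact query on that page may differ, but the usual trick is to
count allocated data pages in the RDB$PAGES system table and multiply
by an estimated rows-per-page figure. A sketch of the idea (MYTABLE
and the rows-per-page estimate of 50 are placeholders you would need
to calibrate against your own row size and page size):

```sql
-- Approximate row count: count the table's data pages (page type 4)
-- and multiply by an estimated average number of rows per page.
SELECT COUNT(*) * 50 AS APPROX_ROWS
FROM RDB$PAGES P
JOIN RDB$RELATIONS R ON R.RDB$RELATION_ID = P.RDB$RELATION_ID
WHERE R.RDB$RELATION_NAME = 'MYTABLE'
  AND P.RDB$PAGE_TYPE = 4;
```

This never visits the data pages themselves, so it is fast regardless
of table size, at the cost of precision.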

The second methodology uses triggers to maintain a count table. Every
time you insert a record, you insert the value +1 in the count table.
Every time you delete a record, you insert the value -1 in the count
table.
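As a sketch, the triggers could look like the following. The names
MYTABLE, COUNTTABLE and AMOUNT are illustrative; COUNTTABLE here is
assumed to have a single integer column AMOUNT:

```sql
-- Record a +1 delta for every insert into the tracked table.
CREATE TRIGGER TRG_MYTABLE_INS FOR MYTABLE
AFTER INSERT
AS
BEGIN
  INSERT INTO COUNTTABLE (AMOUNT) VALUES (1);
END

-- Record a -1 delta for every delete from the tracked table.
CREATE TRIGGER TRG_MYTABLE_DEL FOR MYTABLE
AFTER DELETE
AS
BEGIN
  INSERT INTO COUNTTABLE (AMOUNT) VALUES (-1);
END
```

Because each transaction only ever inserts its own delta rows, writers
never contend on a shared counter record.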

Then create the following stored procedure.

CREATE PROCEDURE SP_UPDATECOUNTS
AS
DECLARE VARIABLE MYCOUNT INTEGER;
BEGIN
  SELECT SUM(AMOUNT) FROM COUNTTABLE INTO :MYCOUNT;
  DELETE FROM COUNTTABLE;
  INSERT INTO COUNTTABLE (AMOUNT) VALUES (:MYCOUNT);
END

Then use iSQL in the scheduler to execute it at a certain frequency.
This approach works quite well, and because of transaction isolation
provided by MGA, it does not inhibit "writers" at all from doing
their job.
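For example, under cron the scheduled run might look like this (the
paths, credentials, and script name are assumptions; adjust them for
your installation):

```shell
# compact_counts.sql would contain:
#   EXECUTE PROCEDURE SP_UPDATECOUNTS;
#   COMMIT;
#
# Crontab entry: compact the count table every 10 minutes via isql.
*/10 * * * * /opt/firebird/bin/isql -user SYSDBA -password masterkey \
    localhost:/data/mydb.fdb -i /opt/scripts/compact_counts.sql
```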

To get the count at any stage:

select sum(amount) from counttable;

The more frequently you run the stored procedure, the fewer records
will need to be added together. It is up to you to pick a good
compromise interval.

> Are there any configuration parameters other than cache and page
> size that affect this behavior?
>

Not really; you have already identified the root cause of the problem.

> Does Firebird 2.0 improve the performance of this scenario in any
> way?

I doubt it; if so, it is probably an indirect performance increase in
the garbage collection that may also happen when you do a select
count(*) after a mass delete.

Adam