firebird-support - Re: [firebird-support] Firebird 2.5: first insert into table takes ages to complete

Subject	Re: [firebird-support] Firebird 2.5: first insert into table takes ages to complete
Author	Helen Borrie
Post date	2018-10-11T09:23:33Z

Dominik,

>> Which sub-release?

> The latest, i.e. 2.5.8.27089

OK.

>> Yes. The symptoms suggest you have a large number of record versions
>> that are waiting for garbage collection. New records cannot be
>> written to existing pages until after that garbage has been cleared.

> This makes sense. We have applied migrations that add columns to
> large tables. This certainly caused every record in that table to be rewritten
> at least once for each new column that was added. Since the added
> columns are also not null, a subsequent update rewrites all records again.

> I would
> however assume that garbage collection is done at that point in
> time when records are updated. This is most probably not the case, allowing
> garbage to accumulate and create a large spike of garbage
> collection work that finally is done at one point in time. In our case that one point in time
> is an insert statement that then takes a minute to complete.

Yes, it looks that way.

>> See above. After a backup and restore, there is no garbage. However,
>> the first operation on a dirty table will cause a garbage collection -
>> hence the long time taken for this first insert.

> So the behavior stems from the design decision that garbage is not collected
> at the end of a statement or transaction.

Garbage is NEVER collected at the end of a statement. It cannot be,
since the new records or record versions are not part of the database
until the transaction is committed.

The garbage is not collected at the end of the transaction, either.
It will be *available* to be flagged for collection once the
transaction is no longer "interesting". After that, it will be
collected in one of two ways: cooperatively or by a background
thread. (Note that background GC is available only on Superserver.)
By default, both cooperative and background GC are configured
(GcPolicy = combined). It is the cooperative GC that is holding up
your first hit on that table.

> It appears to me that this is something
> that we have to live with for the time being, is it not?

Not necessarily. You can configure the GcPolicy parameter in
firebird.conf to 'background' and avoid that hit on the next access to
the table after the last commit. Don't forget, the change won't take
effect until you restart the server.

>> Run gstat -h on the database when you start to notice these delays.
>> Check the values of the various ' ... Transaction' reports and copy
>> them back here.

> There’s nothing unusual there.

> Oldest transaction 3270
> Oldest active 4313
> Oldest snapshot 4313
> Next transaction 4314
> Bumped transaction 1

> As you can see there are no stuck transactions and only very few transactions that are still
> missing a hard commit. The total transactions ran on the database are also very few – caused
> by a recent backup restore as already mentioned before.

This looks like a freshly restored database that has (had) work from
up to 43 transactions that are still "interesting". It could be just
one transaction: GC won't touch any record versions from transactions
that are more recent than the Oldest (interesting) transaction. In my
comment above, I have deduced that you holdup is from a coop GC
preceding the actual work of your insert request.

> Sweep has
> not run yet because oldest snapshot - oldest transaction < sweep interval:

> Sweep interval: 20000

Sweep and GC are not the same thing. GC goes on all the time. An
auto sweep might never occur: if the gap never reaches the "high water
mark" set by the sweep interval, there will never be an auto sweep.

>> As to the cause, it is totally due to inadequate management of
>> transactions. This style of poor management commonly comes from
>> applications that keep read-write transaction open for long periods
>> and never committing them. Are your apps written in Delphi?

> Yes, there are some. However those applications have not connected to the database in question
> since the last backup/restore.

Another thing that points to coop GC being the source of your delay.
See how you go with GcPolicy set to 'background'.

HB

---
This email has been checked for viruses by AVG.
https://www.avg.com