Subject Re: [firebird-support] Re: Firebird 2.0.1: Database corrupt under high CPU load
Author Helen Borrie
At 05:04 AM 22/04/2007, you wrote:

>After analysing my Delphi code the scenario is as follows:
>
>I'm doing a query to get a result set (using an IBQuery
>component).

Is this a query on committed records or on records that were just
inserted by the same transaction?

>Then I am iterating through my result set, reading and deleting each
>record using a SQL statement (using an IBSQL component, depending on
>the same transaction).

Is the user doing that selectively, or is the program using the
iterate-through-a-dataset approach to perform an operation that the
user doesn't see? If the latter, why?

>To delete a record I'm using the FGUID field (the GUIDs were
>generated by a Windows API call) and doing a CommitRetaining while
>the dataset is still open
>and in use => this corrupts the index.

Deleting from a dataset that is in use doesn't cause corruption per
se: the dataset is just a client-side snapshot of what was visible
to the transaction at the time the SELECT request was submitted.
Even when you are iterating through a dataset that is out of synch
with the transaction's current view of database state (as you are
doing here), the worst that should happen is an exception, not
corrupt data. So something *wrong* must be occurring, under certain
conditions, in the accounting the server does when it gets requests
that change the linkage between an index node and a data page.

>The settings are:
>read_committed
>rec_version
>wait
>
>Unfortunately this worked w/o problems except from the
>database corruption from time to time (sometimes weeks).

I think there *is* something unhealthy about using this combination
with your unsynchronised approach. You are operating on an
out-of-date dataset and your settings are saying "do this delete even
if you have to wait", whilst others are busy inserting and deleting
(by the same method) at the same time. The CommitRetaining means
that this table and its index are constantly "dirty", because garbage
and dirty index nodes have the potential to remain forever (or, at
least, until the next time a manual sweep or a backup is done in
exclusive mode). Somewhere in this melee, a pointer sometimes gets
lost.
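
For what it's worth, those parameters correspond to Firebird's own
transaction statement. A sketch of the equivalent DSQL (the
components normally issue this on your behalf):

SET TRANSACTION WAIT ISOLATION LEVEL READ COMMITTED RECORD_VERSION;

A transaction ended with a hard Commit, rather than CommitRetaining,
releases its interest in old record versions and gives garbage
collection a chance to do its work between batches.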

It will strangle performance as well, since the detritus builds up
each time CommitRetaining is called and the high-water mark for
garbage collection gets pushed further and further back relative to
the oncoming transactions.

Another element of risk I see with these client-side generated GUIDs
is the potential for duplication. Ann could explain, far better than
I can, what can happen when a new node is created in an index that is
still holding the stub of an old node with the same value.

I'm still curious to know why you need to use a GUID to identify
these records...
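
If all you need is a unique row identifier, the usual Firebird idiom
is a server-side generator rather than a client-made GUID. A sketch,
assuming a table named atable with a BIGINT ID column -- the names
here are illustrative only:

CREATE GENERATOR gen_atable_id;

SET TERM ^ ;
CREATE TRIGGER atable_bi FOR atable
ACTIVE BEFORE INSERT POSITION 0
AS
BEGIN
  IF (NEW.ID IS NULL) THEN
    NEW.ID = GEN_ID(gen_atable_id, 1);
END ^
SET TERM ; ^

Generator values are produced outside transaction control and never
repeat, so the duplication risk described above simply disappears.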

FWIW, the Paradox-style approach of iterating through a dataset to
perform batch deletes does not belong in client/server. The only
situation where it might be justified is where the user is deleting
records selectively.

Consider:

With your method, you are pulling over a set filtered on the criteria
that isolate the records you want to delete. Then, row by row, you
are reading the (hopefully unique) GUID in order to identify one
record to delete (even though you really want to delete *all* of the
records that match your filter).

So -- if you have a set that is

select blah, etc. from atable
where ...whatever

...then you can write the statement

delete from atable
where ...whatever

...and delete them all as a batch (or not, if an exception occurs).

That's how you should do the batch deletes. You don't need the GUID for that.
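
To make that concrete -- the table and column names here are invented
for illustration -- a single parameterised statement clears the whole
matching set in one request:

delete from atable
where processed = 1
  and created_at < :cutoff

Prepare it once, supply the parameter, execute it, then do a hard
Commit. The set is deleted atomically: all the matching rows go, or
none do if an exception occurs.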

./heLen