Subject Re: [firebird-support] UPDATE to same record causing heavy disk I/O
Author Ann Harrison
On Tue, Dec 22, 2015 at 2:17 PM, Caroline Beltran caroline.d.beltran@... [firebird-support] <firebird-support@yahoogroups.com> wrote:


Since I began using Firebird, I have kept my transactions (type concurrency) very short and then call COMMIT immediately afterward.  This has worked very well.

Good, but that too can be overdone.  Each transaction start causes a change to the header page and end causes changed pages including a transaction inventory page to be written to disk.  There's some grouping of writes, but as a rule, think that each transaction you create causes two extra page writes beyond the data and indexes.  

I recently had the need to perform more complex processing and what I did was to keep everything short and modular.  I am updating different parts of the same record repeatedly and I believe that this is causing multiple back versions which causing excessive disk write I/O and slowing things down terribly:

a) begin a transaction, update FIELD_1 of MYTABLE, COMMIT, and end transaction.
b) begin a transaction, update FIELD_2 of MYTABLE, COMMIT, and end transaction.
c) begin a transaction, update FIELD_3 of MYTABLE, COMMIT, and end transaction.
d) begin a transaction, update FIELD_4 of MYTABLE, COMMIT, and end transaction.
e) begin a transaction, update FIELD_5 of MYTABLE, COMMIT, and end transaction.

There are several problems with this.  One is the significant transaction overhead you introduce. A second, as you've guessed is that you're creating a back version for each update.   Another is that any transaction reading your record between updates will see some field that have been changed and others that haven't.  Another, and not insignificant, is the danger that some other transaction will change your part or all of a record between your transactions, leaving the record inconsistent.
 
Of course, I normally update all fields in one transaction but in this particular case, determining the contents of each field is a complex process that requires manipulation and analysis of the the data provided by a number of other Firebird SELECT queries to the database.

I am averaging about 300 transactions per minute during this process that may last 12 hours and during that time, things get terribly slow.

 Probably some information from the monitoring tables will let someone else give you good advice.  

So can someone confirm my suspicions, will each of the 5 transactions above to the same row of data cause 5 new 'back versions'?

Absolutely. 

Like I said, I have always kept transactions very short.  I am thinking of something like this instead:

a) begin a transaction, update FIELD_1 of MYTABLE.
b) update FIELD_2 of MYTABLE.
c) update FIELD_3 of MYTABLE.
d) update FIELD_4 of MYTABLE.
e) update FIELD_5 of MYTABLE, COMMIT, and end transaction.
If something fails anywhere in between, I would ROLLBACK this single transaction.

That's not going to work either.  Your first update will create a back version that's just the difference between the old record state and the new state.  The second (or maybe third) will create a back version that's the whole record (IIRC) - much larger and possibly off page. Off page matters because it doubles the page writes.

Keeping my transactions shorter and more modular as above is easier from a development point of view but I have the back version performance issue.  Although the second method means a much longer transaction, I won't have back versions to deal with.  Do you think that this approach would be better?

No, just do all the computations in a single transaction and update the record once with all the changes. 


P.S.  Sweeping the database does not help with the performance problem, the only temporary solution to regain performance is to backup using GBAK and restore.

That's why I suspect there's more to it than just back versions and would like to see something about I/O, reads, writes, fetches, marks, etc.

Good luck,

Ann