Subject Re: [IBO] Backups hanging
Author Steve Harp
--- In, Helen Borrie <helebor@...> wrote:
> At 03:30 AM 9/03/2006, you wrote:
> >Hi All,
> >
> >I'm having a recurring issue where the Firebird (v1.5.2.4731) backup
> >is hanging up. I'm using a TIBOBackupService component to back up
> >database_name.fdb and create a database_name.fbk. On rare occasions,
> >the backup is failing and leaving the fbk open. This open file causes
> >future attempts at a backup to also fail. The only solution we've
> >found is to stop/start the service.
> >
> >Is there something I've missed in my implementation? Has anyone else
> >seen this and is there a solution?
> You say the backup "fails", yet your subject says "Backups
> hanging". Which is it?
> If the backup is really failing, you should get an exception and gbak
> will close the file properly. If it's being caused because you're
> backing up to a share, then you'll need to find out what's happening
> to cause the network breakdown.
> If a backup seems to "hang" it usually means there is a lot of
> garbage that gbak has to collect along the way. gbak runs in a
> single concurrency transaction and you can't just stop it because it
> happens to be slow. The backup file stays open until the Service
> Manager closes it. And it won't close it until the process either
> finishes the backup or aborts it because of errors.
> So it looks like one thing leading to another.
> Some things I would do here, if you're periodically getting these
> heavy GC ops occurring during backup and it's enough to be causing a
> 1. Add a button to enable the user to run the backup without GC. On
> these occasions, you could follow up with a routine to restore the
> database under a different filename, connect to it and run a simple
> query. If it returns something, you know the backup is OK. Then you
> can return a message dialog to the user recommending that the
> database (not the server!!) be shut down; if it is accepted,
> disconnect the app from both databases; then run a batch file to
> first rename the old db file, then rename the restored file, then zip
> up the old file.
> 2. You could do something smart like run statistics first and check
> the difference between the Oldest Interesting and the Next
> Transaction. If it's higher than a few hundred, send a dialog to the
> user warning that the backup may take a long time and recommend that
> the database (not the server!) be shut down. If it is accepted, work
> out some way to get all the users off and defer the backup so you can
> do the housekeeping, run the backup without GC and overwrite it (as
> in 1).
> 3. Another thing is not to make repeated backups using the same file
> name for the backup file. I always generate a filename based on
> current server date and time, e.g. MyDb.603090905.gbk if I was
> running it right now. (That's year 6 month 03 day 09 time
> 0905). That way, if you get an orphan backup file, you're not going
> to be prevented from repeating the backup if a user wrecks the backup.
> 4. Since you're getting this big buildup of garbage, you should be
> running sweeps in shutdown mode as part of your regular housekeeping.
> 5. For the long haul, you'll need to find out why you're getting
> such a buildup of garbage and try to fix the problem in your app.
Thanks Helen. I'm not sure this issue has to do with garbage
collection. There's no network issue to contend with. The Firebird
service is running locally and it's basically a single-user
application (although multiple instances of the application may be
running). The databases in question are very small (2 - 10 MB) and
backups and sweeps are performed frequently. We're certain that this
issue happens on Win NT4 machines but we're not certain whether or not
other OSs have had the problem.
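
One way to test the garbage theory against real numbers is the comparison from point 2: the gap between the Oldest Interesting and the Next Transaction. The sketch below is only an illustration, not part of the application. It assumes the header statistics arrive as text containing "Oldest transaction" and "Next transaction" lines, as gstat -h prints them. The function name, the threshold value and the sample numbers are all made up.

```python
import re

# Hypothetical threshold standing in for "a few hundred" from point 2.
GAP_WARNING_THRESHOLD = 300


def transaction_gap(header_text):
    """Return Next transaction minus Oldest transaction.

    Assumes header_text holds lines such as
    'Oldest transaction   1200' and 'Next transaction   1805'.
    """
    wanted = ("Oldest transaction", "Next transaction")
    values = {}
    for line in header_text.splitlines():
        match = re.match(r"\s*(Oldest transaction|Next transaction)\s+(\d+)\s*$", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    if any(key not in values for key in wanted):
        raise ValueError("header text lacks Oldest/Next transaction lines")
    return values["Next transaction"] - values["Oldest transaction"]


def backup_may_be_slow(header_text, threshold=GAP_WARNING_THRESHOLD):
    """True when the gap suggests a lot of garbage for gbak to collect."""
    return transaction_gap(header_text) > threshold


# Made-up numbers, for illustration only.
SAMPLE_HEADER = """\
        Oldest transaction      1200
        Oldest active           1801
        Next transaction        1805
"""
```

If the gap stays small on the machines where the backup file is left open, that would support the view that garbage collection is not the cause here.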

What's happening is that during a backup something causes the backup
file (fbk) to be left open. Then on future backups, the backup fails
because it can't overwrite the fbk.
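
Whatever is leaving the file open, the date-and-time file name from point 3 would at least stop one orphaned fbk from blocking the next backup. A minimal sketch of that naming scheme follows. It is a variation rather than Helen's exact format: it assumes a four-digit year and the .fbk extension, and the function name is hypothetical.

```python
from datetime import datetime


def backup_file_name(db_stem, when, extension="fbk"):
    """Build a backup file name that is unique per minute.

    db_stem   -- database file name without extension, e.g. "database_name"
    when      -- the date and time to stamp into the name
    extension -- backup file extension (assumed to be "fbk" here)
    """
    # Year, month, day, hour, minute, all zero-padded.
    stamp = when.strftime("%Y%m%d%H%M")
    return f"{db_stem}.{stamp}.{extension}"


# 9 March 2006 at 09:05, the moment used in the example from point 3.
name = backup_file_name("database_name", datetime(2006, 3, 9, 9, 5))
```

The resulting name would then be passed to the backup component in place of the fixed database_name.fbk. Old backup files accumulate with this scheme, so some cleanup routine would be needed as well.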