Subject Re: [firebird-support] Re: Truncating transaction logs in Firebird in v2.01 - v2.03
Author Helen Borrie
At 03:21 PM 14/04/2008, you wrote:
>Thanks for the responses!
>I guess in Firebird term it is called SWEEP or GARBAGE COLLECTION.
>GBAK does garbage collection by default, unless you specify "-g" option.

Well, sweeping and GC don't have anything to do with "transaction logs". They are both mechanisms for removing unwanted record versions and freeing up space on data pages so it can be re-used.

>So it is confusing to me when I backed up the database after manual SWEEPING and with Garbage Collection -- Restored the database from backup into new file -- SWEEPED new database again -- again backed it up with garbage collection -- and eventually re-restored again into new database file.
>After doing that the final size of the new file is dramatically reduced. The data and schema has not changed, however....

It is illusory and probably indicates that one of your restores didn't finish.

1. If you perform a gbak backup WITH garbage collection (the default) it does what sweep does, anyway. So sweeping as a precursor to routine backup is pointless.

2. If you perform a backup with the -g switch, the gbak process won't do any GC. Use the -g switch when you do a backup in exclusive mode with the intention of restoring it immediately without putting the database back online between the operations.

3. No form of garbage collection - either by sweeping, by backing up or by the routine GC that goes on all the time - will free up disk space and reduce the size of the database file.

4. Backup never saves old record versions. It reads the data from the latest committed version of each table as it stood at the start of the snapshot transaction in which the process runs.

5. Restoring a database will produce the smallest possible database file. This is because a restore completely recreates the database by creating a new file, writing the header, pumping the metadata and creating the objects and (almost) finally, pumping in the data. The last things to happen are the activation of the indexes and the recreation of the SQL privileges. The restore requests pages only as it needs them. No garbage is created by a restore.

6. Apart from the small variations in file size you might observe if your disk is very fragmented, you won't see any difference between the file sizes of two databases restored from the same backup. So if you have observed the file size "dramatically reduced" between two such operations, suspect that your restore in the "smaller" case did not finish; or else you ran that restore with the -i switch to disable the activation of indexes.

7. In essence, you should just stop worrying about database file growth if you are sure you are taking proper care of transactions in your application code AND you have allowed room for growth. Firebird isn't storing data in physical tabular structures at all. Database file growth reflects the number of *pages* that have been acquired since the last time you looked while the database was offline. Even though your most-used tables might be small enough to fit 20 records on a page, the engine will acquire a whole new page if it can't find enough recyclable space to accommodate a new record version.

If your page size is unnecessarily large, that could represent quite a lot of growth in a short period, especially if you have long-running read-write transactions that are inhibiting the routine GC.