Subject Re: [firebird-support] can you "compact" a database ?
Author Helen Borrie
At 09:15 AM 29/12/2007, you wrote:

>I wonder if it's possible to "compact" a database, in other words to
>delete permanently the deleted data.

The notion of "compacting" a database belongs to xBase and other file-based database architectures, where tables are discrete physical objects. In these models, deleted data are not deleted, only disabled.

In Firebird, when data is deleted, it becomes obsolete and remains as a "stub" until it is no longer "interesting" to any extant transactions. Then it is garbage-collected and the space it used is released for re-use.

>I have a database ~50MB. After deleting all the entries in most tables,
>the database file size is practically the same, while it should be 2MB
>at most. This means data is just marked as deleted, but in fact
>remains - I'm looking for a way to remove it really.

It really has been removed. The database will re-use the space eventually. As long as you observe good practice in managing transactions, the database growth will settle eventually and will be more reflective of the actual growth of data.

A Firebird database uses a multi-generational technique to manage concurrency and data consistency. Data of many kinds - records for tables, index trees, blobs, inventory pages and many more - are stored on blocks of disk known as "pages". There are no "files" inside the database like in your xBase dbs.

If a single data page has enough space to store more than one record then multiple records on a single page will always belong to the same table. When the engine wants to store a new record version, it seeks available space on pages that already contain record versions for that table. If it can't find enough space, it asks the operating system's file system for more.

At any moment in the life of an active database, there are likely to be multiple versions of records in the database. One version of each record is the "latest committed version" which is what all transactions see when they start. Older versions may still be there, either because there are are transactions still interested in them or because they are waiting for garbage collection. Newer versions exist wherever there is a transaction that has posted an update that is not committed yet.

You can totally "refresh" the physical layout of a database by backing it up (using gbak) and restoring it from the backup file. The restored database is "brand new", with data pages filled to ~80% where possible and no redundant data pages waiting for re-use.

./heLen