Subject Re: Mass Deletes and Backup/Restore questions to make a database smaller in size
Author Adam
--- In firebird-support@yahoogroups.com, "SoftTech" <miket@...> wrote:
>
> Greetings All,
>
> Firebird 1.5.3
> Delphi 5.1
>
> My goal is to do house keeping on all of the history tables in our
database and keep the last 30 days on some tables and the last 30
activities on others. Then do a backup and restore to make the size
of the database smaller.
>
> Please keep in mind that I did the following on a backup of the
production database so that I could test my work and not affect the
production database until I'm sure this will work.
>
> One of the tables in our database (DIALER_RESULTS) stores results
from a predictive dialer. The table currently has over 5-1/2 million
records in it and we have decided to do some house keeping on this
table by keeping the last 30 activity results for each account and
deleting the rest of the records.
>
> I have created a program that selects all records from the
DIALER_RESULTS_ACCT table into a TClientDataSet (This table has just
one field ACCT_ID) and then loop through each record in this dataset
which then calls a stored procedure that selects all records for this
account in descending order and deletes any record past the first 30
records.
>
> I ran this program on a weekend with no one else around (as I ran it
on the same server used for the production database) and it
successfully did the job in several hours returning that over 3-1/2
million records were deleted.
>
> Q: Should I have done a select statement against this database to
force garbage collection?

Not if you intend to do a backup/restore cycle immediately afterwards.

>
> Q: Is it a good thing to delete this many records at one time from
one table?

It is neither good nor bad. It comes down to your business
requirements. Keep in mind that Firebird can not truly delete the
records straight away in case some concurrent snapshot transaction
needs that data. Instead it flags each record as deleted. Eventually,
the record can be physically removed in a process called garbage
collection. This garbage collection may be costly (especially on 1.5
or earlier with certain indices), so you must take this into
consideration or you will end up impacting the performance of your users.

>
> Q: Any gotchas when it comes to doing a backup and restore? The
backup normally takes a few hours to run on a 6-1/2 GB database. Will
the backup take longer now because of the mass delete?

* You will need exclusive access to the database during this cycle,
(or at least make sure no-one else can write to the database, because
any transactions committed after the backup starts will not be
included in the backup file). I physically rename the database file.

* If you intend to do a backup-restore, use the -g switch in gbak.
There is no point waiting for garbage collection if you intend on
deleting the original file anyway. If you have deleted a bunch of
records, the backup file will be smaller and the restore will be quicker.

* Restore the database to a different name. If for whatever reason the
restore fails, you don't want to have overwritten the original
database file.

* Don't allow anyone to connect until the restore has completed.

>
> Q: Any guidelines available on the best ways to do a mass delete and
what all should be done to make the size of the database smaller?

Forget file size. Firebird will re-use disk space if you manage your
transactions. If you have running transactions for hours/days on end,
garbage collection won't be able to keep up which will cause you
problems, but if you run a cron job every night to purge records older
than X, I would expect your database size to stabilise and grow only
in proportion to the growth in records.

Adam