Subject: Re: [firebird-support] Re: nbackup / gbak interoperation
Author: Ann W. Harrison
Doug Chamberlin wrote:
> Yes, lots of clear thinking and planning and extra resources can prevent
> the problem from arising. That does not help the poor DBA who finds
> himself in a pickle from which this capability would rescue him. That is
> the situation I'm addressing. Telling people they should have done
> something differently just does not address the problem.

Actually, there's an easy fix if you find yourself running out of disk -
buy another disk and add a file on that disk to the database. Buy a
USB disk and you don't need to take the system down. It is frankly
unlikely that you'll get enough space back from a backup/restore
to save yourself. If you've done lots of deletes or updates and have
lots of free space, then Firebird is going to use that space before
it extends the file again. If Firebird is extending the file, then
you don't have (much) free internal space to recover.
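
For example, something along these lines - the file name and starting
page number are placeholders, and you'd pick a starting page at or
beyond the current size of the primary file:

    -- illustrative only: add a secondary file on the new disk
    ALTER DATABASE
        ADD FILE '/mnt/newdisk/mydb.fb2'
        STARTING AT PAGE 500000;

Once the primary file fills up to that page, Firebird simply
allocates new pages in the secondary file.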

I've explained this several times, but let me do it one more time. It's
very hard to release space from a Firebird database without rebuilding
it. If you've got a database from which you just deleted 1/3 of the
records, you'll have free space scattered through the file. If you're
really lucky, all the records you deleted will be on the same pages, so
1/3 of the data pages will be empty. In real life, you'll probably have
1/10 empty pages and the rest partially full. In either case,
the empty pages are unlikely to be contiguous, let alone contiguous and
at the end of the file. You can't release pages in the middle of a
file.

So let's assume that you're really lucky - or that your database design
is such that you store a lot of stuff one day and delete all of it the
next day... a FIFO database. Even so, you're not going to find a bunch
of empty, released data pages sitting alone at the end of the file. Mixed in
with them will be transaction inventory pages, maybe a page inventory
page, pointer pages and index pages, none of which you can get rid of.
Those pages would need to be moved back, and all references to them
would need to be fixed. Relocating an index page is a major headache:
you have to fix three pointers without deadlocking or leaving dangling
pointers in the middle of the operation. Just splitting index pages is hard,
and when you're splitting at least you know where you are. Besides,
the next day you'd just reallocate the same number of pages to store
that day's data.

Now, perhaps you're thinking of an incremental logical backup in the
server that intercepts changes during the backup/restore cycle
and applies them once the restore is done. If primary keys were
required, that might be possible, though not at all easy... since
queries would need to look first at the database, then at the saved
changes to be sure they got the right version of data. Without
primary keys, there would be no way to know what record should be
changed. Record numbers (db-keys) are not stable over backup/restore.
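
To make the primary key problem concrete, every read during the cycle
would have to merge the two sources, with the saved changes winning.
A rough sketch - saved_changes and restored_table are hypothetical
names, nothing like this exists in Firebird:

    -- take the queued change for this key if there is one...
    SELECT * FROM saved_changes WHERE pk = :id
    UNION ALL
    -- ...otherwise fall back to the restored data. Without a
    -- primary key there is nothing to correlate on.
    SELECT * FROM restored_table r
     WHERE r.pk = :id
       AND NOT EXISTS (SELECT 1 FROM saved_changes s
                        WHERE s.pk = :id)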

Besides, when all is said and done, space will be reused. In the
very early days - InterBase V1.5, I think - we discovered that leaving
free space on each page proportional to the size of the records
improves performance on a read/write test. So a little fluff is
a good thing.
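
That's also why gfix lets you control the fluff: by default each data
page reserves room for record versions, and you can pack pages full
instead - sensible only for databases that are mostly read-only. The
path here is a placeholder, and you may need -user and -password:

    # pack data pages completely (read-mostly databases only)
    gfix -use full /data/mydb.fdb
    # the default: reserve room on each page for record versions
    gfix -use reserve /data/mydb.fdb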

Basic fact: if you're coming from Postgres and therefore know all
about the need for vacuuming and compaction, think again. Postgres
was designed for optical media and originally had no provision for
space reclamation whatsoever. That was a long time ago and they've
moved on, but the design does not lend itself well to space reuse.
InterBase was designed for magnetic disks that were microscopic by
modern standards, and it and its descendants make pretty good use
of space.

If anybody has gstat output showing a table with more than 500 records,
where the records are less than 1/3 the size of a page, with less than
a 50% fill rate in normal use - not right after a huge delete or while
garbage collection is lagging - I'll be very surprised.
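
If you want to check your own database, point gstat at it (again,
the path is a placeholder):

    # per-table record statistics, including fill distribution
    gstat -r /data/mydb.fdb

Each table's report ends with a "Fill distribution" block counting
pages in 20% buckets; a normally loaded read/write table should show
most of its pages in the 60 - 79% and 80 - 99% buckets.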

Cheers,

Ann