Subject Re: [firebird-support] Database file size
Author Helen Borrie
At 07:06 PM 31/01/2004 +0000, you wrote:

> Thus, a record is 24 bytes, the fields for the PK comprise 20
>bytes, but the database averages some 103 bytes per record (after
>dropping the 650,000 bytes of the empty database and table).

In addition to what others have said, remember that Firebird doesn't store
"records" that have a tabular structure like Btrieve et al. It stores the
data for rows on database pages. It draws more space for pages from the
filesystem as required. A page is 4 KB by default (different if the page
size was reconfigured when the database was created or restored).
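
If you want to check the page size of an existing database, gstat will
report it from the header page (a sketch, using a placeholder file name;
the exact output layout varies between versions):

   gstat -h mydata.fdb
      ...
      Page size             4096
      ...

Assuming you wanted a different size, you would declare it when creating
the database:

   CREATE DATABASE 'mydata.fdb' PAGE_SIZE 8192;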

Data for multiple rows can be stored on the same page, each row in its
own piece of physical address space. However, the engine (by default)
fills data pages only to approximately 80%, reserving the rest for back
versions of updated rows; once a page reaches that threshold, the next
incoming row goes onto a new page.
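
As a rough back-of-envelope calculation with your own numbers (assuming
the 4 KB default page size and your ~103 bytes per row, and ignoring the
page header and per-row overhead):

   4096 bytes/page * 0.80 fill  =  ~3277 usable bytes/page
   3277 / 103 bytes per row     =  ~31 rows/page

so each of your table's data pages holds on the order of 30 rows.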

Unlike your Btrieve database, you don't have a file with the first record
physically located at the beginning, beneath a header, and the last at the
end. Pages for a particular table can be anywhere at all - there is no
requirement for them to be contiguous on the disk and very often they're
not. In a multi-file database, pages for the same table might be
distributed across the files.
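
You can watch this at the page level with gstat's data page analysis (a
sketch with placeholder names; output details vary by version):

   gstat -d mydata.fdb
      ...
      MYTABLE (128)
          Primary pointer page: 268, Index root page: 269
          Data pages: 1420, data page slots: 1420, average fill: 79%
      ...

The "average fill" figure there is the ~80% discussed above.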

Once a page has been allocated from the filesystem, it never gets put
back. Space occupied by deleted rows and old versions of rows will get
freed eventually, as a result of garbage collection and sweeping. Once
that space is freed, it is available to take new rows and record versions
for the same table. Over time, if the actual volume of data settles down
to a more or less stable level and good housekeeping practices are
followed, you'll stop seeing significant file growth.
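
Good housekeeping, concretely, means letting garbage collection run and
sweeping from time to time. A sketch with gfix, using a placeholder file
name (20,000 transactions is the default automatic sweep interval):

   gfix -sweep mydata.fdb                (sweep immediately)
   gfix -housekeeping 20000 mydata.fdb   (set the sweep interval)
   gfix -housekeeping 0 mydata.fdb       (disable automatic sweeping)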

Pages are used to store everything. Indexes are stored on index
pages. Various accounting and inventory structures maintained by the
engine (transaction and page inventories, for example) are stored on
their own pages. Blobs are stored on their own
pages. And so on.
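
gstat's index analysis shows how those index pages are organised (again
a sketch; the index name is a placeholder):

   gstat -i mydata.fdb
      ...
      Index PK_MYTABLE (0)
          Depth: 2, leaf buckets: 48, nodes: 45000
      ...

Depth is the number of page levels in the index tree; a restore (below)
rebuilds these trees at optimum density.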

Restoring from a gbak backup completely reconstructs the database and
rebuilds and rebalances the indexes. If you do a b & r after an intense
period of DML activity, you'll see that the restored database is a lot
smaller, because the data gets laid down onto new pages that are filled
consecutively.
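
A minimal backup-and-restore cycle with gbak (placeholder file names;
the -p switch, which picks a new page size on restore, is optional):

   gbak -b -v mydata.fdb mydata.fbk
   gbak -c -v -p 8192 mydata.fbk mydata.new.fdb

Restore to a new file and swap it in after you've checked it - never
restore over a live database file.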

/hb