Subject blobs and multiple updates
Author jbgilmartin
I have an application that is currently in MS-Access that I am
considering switching to FB.

One particular application need is to store a process log file for a
set of objects. The objects are stored as single rows in a table
which have a blob field for the variable length and width log entries.

It is usual to have a set of rows/objects that are created and then
later another application acts on those objects generating variable
length log line entries (each entry could be from 25 to 150
characters in length) and varying between 10 and 30 lines long.
Instead of potentially wasting 125 bytes of space for each log line
entry if I stored each entry as a separate record in a log table
(plus 4 bytes of space for the key field for each line), I wanted to
use a blob field to store these.

Since the each log entry line comes in piecemeal, I am updating the
blob fields for those objects for each log line. For example if there
are 2 objects being worked on, object 1 will update its blob field
and maybe object 2 will update its blob field twice before object 1
needs to update its blob field again for a new log entry.

THE PROBLEM: with access, each update to a blob field that is larger
than the previous blob size (always the case in this application)
means the blob segment of the data page gets uprooted and moved to
another page. Since the application suite works on say 300 objects
over the course of 8 hours, the result is that the access database
gets huge. This is remedied by compacting the database periodically,
but I have run across problems with the application crashing after
about 8 hours of work. I cannot cache the log entries and update the
blob field just once as they have to be available in real time to
other applications via the database.

I believe Firebird uses similar mechanisms when working with blobs.
Is there any way around this? Any other ideas that I may have blindly
missed? One idea was possibly overallocating a blob field say at 8K
and each blob update would maintain the size of the blob field so
that the blob field would not get moved each time there is an update.
I generally know when the final log entry is written and I could
truncate the field then to the actual size. Anything else I'm missing?

Thanks
-- Jim Gilmartin