firebird-support - Re: [ib-support] Re: Database grows rapidly (3 to 770MB) when using BLOB views/UDFs

Subject	Re: [ib-support] Re: Database grows rapidly (3 to 770MB) when using BLOB views/UDFs
Author	Claudio Valderrama C.
Post date	2002-03-24T02:44:51Z

"Ivan Prenosil" <prenosil@...> wrote in message
news:20020322115313.560FE15002@......

>
> What I had in mind was the situation where you create a temporary blob
> (e.g. by selecting from a view that contains a UDF with a blob as return
> parameter), and then you do not use (read) it at all.
> - Because such a blob is not assigned to any row in a table,
> it can't be garbage collected the usual way.
> - You also do not call isc_cancel_blob (either because of a client crash,
> or because the driver you use does not allow such a level of control,
> as mentioned in previous messages (ODBC)).
> Will such a blob ever be deleted?

isc_cancel_blob is the main way. Otherwise, consider a backup. :-)
release_blobs() in jrd/exe.c will try to help a bit. It's called by looper()
when a request has finished, and it internally calls BLB_cancel and
BLB_release_array. However, IMHO, it can't kill blobs that you still have
open from the client side.

When a blob is created, it's chained to the attachment, to the transaction
and to the linked list of blobs held by the transaction. Hence:
- the looper tries to get rid of blobs that are in the same request
- when the txn is finished, more cleanup happens: the txn calls BLB_cancel()
on all temporary blobs.
- when the attachment is finished, the last cleanup is attempted (I didn't
check if there's something explicitly related to blobs other than notifying
other modules that an attachment is finished).
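
The first two cleanup stages above can be pictured with a toy model in C. This is only a sketch under stated assumptions: ToyBlob, ToyTxn and the toy_* functions are hypothetical names invented for illustration, not the engine's structures, and the real BLB_cancel does far more than free a list node. The attachment stage is left out because its blob handling was not checked.

```c
/* Toy model only: hypothetical names, not the engine's real code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyBlob {
    int request_id;        /* request that created the blob */
    bool temporary;        /* not assigned to any table row */
    bool open_by_client;   /* client still holds a handle on it */
    struct ToyBlob *next;  /* next blob held by the same transaction */
} ToyBlob;

typedef struct {
    ToyBlob *blobs;        /* head of the transaction's blob list */
} ToyTxn;

/* Create a temporary blob and chain it to the transaction. */
static ToyBlob *toy_create_blob(ToyTxn *txn, int request_id, bool open_by_client)
{
    ToyBlob *b = malloc(sizeof *b);
    assert(b != NULL);
    b->request_id = request_id;
    b->temporary = true;
    b->open_by_client = open_by_client;
    b->next = txn->blobs;
    txn->blobs = b;
    return b;
}

/* Unlink and free the blobs selected by the current stage; return the count. */
static int toy_sweep(ToyTxn *txn, int request_id, bool txn_end)
{
    int freed = 0;
    ToyBlob **link = &txn->blobs;
    while (*link != NULL) {
        ToyBlob *b = *link;
        /* Request end: only this request's temporary blobs that the client
           no longer has open. Transaction end: every temporary blob. */
        bool doomed = txn_end
            ? b->temporary
            : (b->temporary && b->request_id == request_id && !b->open_by_client);
        if (doomed) {
            *link = b->next;
            free(b);
            ++freed;
        } else {
            link = &b->next;
        }
    }
    return freed;
}

static int toy_request_finished(ToyTxn *txn, int request_id)
{
    return toy_sweep(txn, request_id, false);
}

static int toy_txn_finished(ToyTxn *txn)
{
    return toy_sweep(txn, 0, true);
}

/* Number of blobs still chained to the transaction. */
static int toy_count(const ToyTxn *txn)
{
    int n = 0;
    for (const ToyBlob *b = txn->blobs; b != NULL; b = b->next)
        ++n;
    return n;
}
```

In this model a blob that the client still has open survives toy_request_finished() and only goes away in toy_txn_finished(), which is the behaviour described in the list above.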

Merely creating a blob gives you a totally volatile struct. As soon as you
push data into it, the story becomes more complex. A few bytes can be
held in RAM, but as the blob grows, real db pages are allocated for it.
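
To make the "RAM first, pages later" idea concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: ToyBlobStore, toy_put_segment and the two size constants are hypothetical, and the real engine's thresholds and page layout are different.

```c
/* Toy model only: hypothetical names and numbers, not the engine's values. */
#include <assert.h>
#include <stddef.h>

enum {
    TOY_INLINE_LIMIT = 64,   /* assumed: bytes that still fit in RAM */
    TOY_PAGE_SIZE = 1024     /* assumed: usable bytes per database page */
};

typedef struct {
    size_t length;  /* total bytes pushed into the blob so far */
    size_t pages;   /* database pages allocated; 0 while RAM-only */
} ToyBlobStore;

/* Push one segment; once the blob outgrows the inline limit,
   account for the pages needed to hold all of its data. */
static void toy_put_segment(ToyBlobStore *b, size_t nbytes)
{
    assert(b != NULL);
    b->length += nbytes;
    if (b->length <= TOY_INLINE_LIMIT) {
        b->pages = 0;  /* still purely volatile */
        return;
    }
    /* Round up to whole pages. */
    b->pages = (b->length + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}
```

A blob that only ever receives a few bytes never touches a page in this model, which is why such a blob simply disappears if the process dies.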

> > That said, if you look at BLB_close, it doesn't call delete_blob. It
> > defers the page cleaning to normal garbage collection.
>
> I do not understand this sentence. I thought that "normal garbage
> collection" takes place when somebody "touches" the _record_ in _table_;
> garbage collection thread then removes obsolete record version(s),
> and obsolete index entries, _and_ obsolete blobs.

OK, there's a special garbage collection routine in blb.c that handles blobs,
but only blobs that are reachable from tables. Very small blobs simply
vanish if the server crashes because they are held in RAM only. For others,
the transaction informs the page manager and the cache manager that cleanup
is due. If the server crashes, I think a validation will mark those pages as
free (in case they were written before the crash).

> If you do not explicitly delete a temporary blob with delete_blob,
> how/when will it be removed from the database?
> a) if I just call BLB_close;
> b) if I do not call anything

When the transaction finishes, basically.

As for why you are able to read temporary blobs: this wasn't the original
design, but "daves" (Dave Schnepper, I suppose) made the change around v4 and
added a few validations to keep the server from being brought down by invalid
temporary blob ids, but I'm not sure those checks are enough. There's a
comment in the code. A second temporary blob is created to retrieve data
from the temporary blob that the UDF filled. That blob points to the same
pages as the original one. I can think of a wicked scenario where you
create the blob with a direct API call, pass it to the UDF that will fill it
(instead of letting the engine create it for the UDF) and also pass it to
another client thread that requests segments from the same blob. :-)

C.
--
Claudio Valderrama C. - http://www.cvalde.com - http://www.firebirdSql.org
Independent developer
Owner of the Interbase® WebRing