Subject Re: [IB-Architect] v6.0 GBAK and garbage collection
Author Charlie Caro
Leyne, Sean wrote:
>
> With the introduction of the background garbage collection thread in
> v6.0, would GBAK still perform the cleanup itself or does it simply flag
> the page to be cleaned up by the background thread (like a normal I/O
> operation would)?
>
> If it does, wouldn't the backup actually run fast if it didn't and left
> it to the background task?
>
> Sean
>

Sean,

GBAK still performs its own garbage collection, for the very reason that
Jason Chapman mentioned earlier. Many customers back up a database for
the performance side effect of garbage collecting useless record
versions, and they sometimes do it in offline batch mode.

The garbage collector thread was designed to exit when the last database
attachment is released. If GBAK merely flagged pages for that thread, a
backup could finish in 1 hour and queue 3 hours of garbage collection.
When GBAK detached as the last attachment, the garbage collector would
exit and leave all the garbage record versions in the database.
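
As a rough illustration of that scenario, here is a toy Python model. It
is only a sketch: ToyDatabase, flag_page, gc_step and
backup_delegating_gc are hypothetical names, not anything from the
InterBase source, and one "step" stands in for an arbitrary unit of
garbage collection work.

```python
from collections import deque


class ToyDatabase:
    """Toy model only; none of these names come from the InterBase source."""

    def __init__(self):
        self.attachments = 0
        self.gc_queue = deque()  # pages flagged for background cleanup
        self.collected = []      # pages the background thread has cleaned

    def attach(self):
        self.attachments += 1

    def detach(self):
        self.attachments -= 1

    def flag_page(self, page):
        self.gc_queue.append(page)

    def gc_step(self):
        """Clean one queued page; return whether any work was done.

        Assumption mirrored from the post: the collector stops as soon
        as no attachment remains, even if pages are still queued.
        """
        if self.attachments == 0 or not self.gc_queue:
            return False
        self.collected.append(self.gc_queue.popleft())
        return True


def backup_delegating_gc(db, pages, gc_steps_during_backup):
    """Hypothetical GBAK that only flags pages instead of cleaning them."""
    db.attach()
    for page in range(pages):
        db.flag_page(page)
    # The collector gets some time while the backup is still attached.
    for _ in range(gc_steps_during_backup):
        db.gc_step()
    db.detach()
    # After the last detach the collector does nothing more.
    while db.gc_step():
        pass
    return len(db.collected), len(db.gc_queue)
```

Calling backup_delegating_gc(ToyDatabase(), pages=4,
gc_steps_during_backup=1) leaves three of the four flagged pages
uncollected, which is the outcome an offline batch backup is meant to
avoid.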

I had considered an alternative strategy which still has some merit:
make GBAK kick off the database sweep thread inside the server. Unlike
the garbage collector thread, the sweep thread must run to completion
because its purpose is to update the OIT (oldest interesting
transaction) as well as to garbage collect. So the benefits would be:

1) GBAK runs fast with no garbage collection;
2) Sweep garbage collects in parallel even after GBAK exits;
3) Sweep updates the OIT for free;
4) GBAK's concurrency mode transaction is shorter, so fewer record
versions have to be maintained during online backup;
5) Sweep's transaction (new V6 behavior) can run forever if it has to
without inhibiting any garbage collection.
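
The alternative can be sketched in the same toy style. Everything here
is an assumption made for illustration: ToySweepServer and its methods
are hypothetical, the sweep runs after the backup rather than in
parallel with it, and "advance the OIT to the oldest active
transaction" is a simplification of what a real sweep does.

```python
class ToySweepServer:
    """Toy model only; not the InterBase server's actual design."""

    def __init__(self, oit, oldest_active):
        self.attachments = 0
        self.pending = []              # pages with garbage record versions
        self.collected = 0
        self.oit = oit                 # oldest interesting transaction
        self.oldest_active = oldest_active

    def backup(self, pages):
        """Hypothetical GBAK: read pages, collect nothing, request a sweep."""
        self.attachments += 1
        self.pending.extend(range(pages))
        self.attachments -= 1          # GBAK detaches as soon as it is done
        self.sweep()

    def sweep(self):
        """Run to completion regardless of how many attachments remain."""
        while self.pending:
            self.pending.pop()
            self.collected += 1
        # Simplification: a finished sweep lets the OIT move forward.
        self.oit = self.oldest_active
```

With server = ToySweepServer(oit=10, oldest_active=50) followed by
server.backup(4), every flagged page is collected even though no
attachment remains, and the OIT has moved forward as a side effect.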

It's too late to adopt this alternative behavior for V6. If the
community thinks it is a good idea, it can be changed in the future.

Regards,
Charlie