Subject Re: [IB-Architect] Backups of large database & super transactions
Author Jason Wharton
Jim,

> >I'm not talking about maintaining an external "log" of any kind. I'm
> >proposing that the existing versioning engine be modified such that the
> >garbage collection and transaction management system will efficiently
> >protect a "frozen" transaction and make it so that the changes since that
> >time can be discernable. I see this as a natural extension/enhancement of
> >the versioning engine we already enjoy. The "log" is actually woven through
> >the version tree of records. We would just need a way to extract and
> >re-apply them.
> >
>
> You're essentially talking about stopping garbage collection.

No, I'm talking about enhancing it. Again, I think your biases are getting
in the way here.

There is no reason that a certain amount of garbage collection couldn't be
going on within a super-transaction after a freeze point has been
established. I'm not saying that every normal transaction since the earliest
freeze point has to be saved. I agree that the granularity of an individual
transaction is a "holy grail" here.

> Record chains get longer and longer,

Increased some, yes... You would have at most one additional record version
in the chain for every frozen super-transaction, and only for records that
were updated or deleted. Garbage collection would certainly remove everything
prior to the first ST, and it could clear out intermediate record versions
between the STs and between the last ST and the OAT. In essence, everything
prior to the OAT gets "merged" into a single consolidated super-transaction.
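
To make that concrete, here is a minimal sketch of the pruning rule I have
in mind. The structures are deliberately simplified (a newest-first version
chain tagged with the writer's transaction id, the OAT and freeze points
passed in as a plain array); none of this is actual InterBase code, just an
illustration of the bookkeeping:

/* Minimal sketch (not InterBase internals): pruning one record's version
 * chain while preserving the newest version visible to the OAT and to
 * each frozen super-transaction ("freeze point"). */
#include <stdio.h>
#include <stdlib.h>

typedef struct Version {
    long txn;               /* id of the transaction that wrote this version */
    struct Version *older;  /* next older version in the chain               */
} Version;

static Version *mkver(long txn, Version *older)
{
    Version *v = malloc(sizeof *v);
    v->txn = txn;
    v->older = older;
    return v;
}

/* readers[] holds the OAT followed by every freeze-point id, sorted
 * descending. A version is kept if it is newer than the OAT (normal MVCC
 * still needs it) or if it is the newest version visible to some reader;
 * everything else is garbage. */
static Version *prune(Version *newest, const long *readers, int nreaders)
{
    Version *head = NULL, *tail = NULL;
    int r = 0;

    for (Version *v = newest; v != NULL; ) {
        Version *older = v->older;
        int keep = 0;

        if (r == 0 && v->txn > readers[0]) {
            keep = 1;                       /* newer than the OAT */
        } else {
            /* newest version at or before one (or several) readers */
            while (r < nreaders && v->txn <= readers[r]) {
                keep = 1;
                r++;
            }
        }

        if (keep) {
            v->older = NULL;
            if (tail) tail->older = v; else head = v;
            tail = v;
        } else {
            free(v);                        /* collect this version */
        }
        v = older;
    }
    return head;
}

int main(void)
{
    /* Chain, newest first: written by txns 120, 90, 70, 50, 40, 20 */
    Version *chain =
        mkver(120, mkver(90, mkver(70, mkver(50, mkver(40, mkver(20, NULL))))));
    long readers[] = { 100, 60, 30 };   /* OAT = 100, freeze points 60, 30 */

    chain = prune(chain, readers, 3);
    for (Version *v = chain; v; v = v->older)
        printf("kept version written by txn %ld\n", v->txn);
    /* prints 120, 90, 50, 20: one extra version per freeze point at most */
    return 0;
}

For a record touched in every interval, that works out to at most one
surviving version per freeze point, plus whatever normal MVCC already keeps.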

> data pages get fluffier and fluffier,
> the amount of space filled with near useless crud goes up,
> indexes fill up with useless values pointing to unreachable records,
> index retrievals waste resources chasing deleted records,
> disk usage goes up, disk IO goes up, memory usage goes up,

All of this is already a maintenance issue. For the reasons above, I don't
think it would become critically significant.

Don't forget that the only situation in which you would really ever do this
is when what changes on a daily basis is incredibly small relative to the
size of your database. Most of the systems that would employ such a feature
do mostly inserts and queries, with very little updating and possibly no
deleting at all.

To scale back some of the impacts of this, you could design it so that
indexes would not keep anything prior to the OAT. All that would mandate is
that any query mechanism used to read information from a frozen super
transaction would not be able to take advantage of indexes at all. I bet it
is safe to assume that GBAK doesn't use indexes to process the data.
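
As a hedged illustration of the read side: resolving a record for a frozen
super-transaction is just a walk down its version chain to the newest
version old enough to be visible, with no index involved. Illustrative
types only, not engine code:

/* Illustrative only: which version of a record a reader positioned at a
 * given freeze point sees. No index is consulted, so a snapshot read is
 * a natural (sequential) scan over the table. */
#include <stdio.h>

typedef struct Version {
    long txn;               /* id of the transaction that wrote this version */
    struct Version *older;  /* next older version in the chain               */
} Version;

static const Version *visible_at(const Version *newest, long freeze_point)
{
    for (const Version *v = newest; v != NULL; v = v->older)
        if (v->txn <= freeze_point)
            return v;       /* newest version at or before the freeze point */
    return NULL;            /* record did not exist yet at the freeze point */
}

int main(void)
{
    /* chain, newest first: versions written by txns 90 -> 50 -> 20 */
    Version v20 = { 20, NULL }, v50 = { 50, &v20 }, v90 = { 90, &v50 };
    const Version *seen = visible_at(&v90, 60);     /* freeze point at 60 */
    printf("freeze point 60 sees the version written by txn %ld\n",
           seen ? seen->txn : -1L);                 /* prints 50 */
    return 0;
}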

> the amount of processing
> required to find a record goes up.

If the indexes are kept free of ST data, then I don't think this is a
significant factor.

> That's an extremely expensive
> way to take a snapshot.

That's not the way I have envisioned it...

> The resources wasted wallowing in a database
> with internals like a landfill dwarf the cost of spinning off a clone.

Not if it is done correctly.

> Sure, it could be done and it would be easy. As soon as the source
> comes out, we'll make you a special version with garbage collection
> disabled and you can try for a week or two. The bad part won't be
> that your application slowly grinds to a halt but the inevitable
> sweep that happens when garbage collection is re-enabled. It will
> finish sometime in the next millennium if you can keep your machine
> alive that long.

Couldn't resist entertaining us some more, could you...

> Database performance is mostly an exercise in maintaining density
> and locality while avoiding tying yourself in knots. Turning off
> internal maintenance is not that good of an idea.

Agreed. I didn't suggest that we do that. I suggested it be enhanced.

> >I agree that things like Index(s) should be left out. It should only be
> >actual data that gets streamed out. I also think that any metadata changes
> >should "break" the ability to walk from one freeze point to another. If
> >someone needs to alter their table structures then the database should be
> >considered incompatible with the other structure. Thus, a DBA would
> >establish a new base after performing any structural changes. It is already
> >my rule of thumb to do a total backup and restore after altering a database
> >structure...
> >
>
> Now I am confused. You're not keeping a log of any kind, you've got
> a special backup in bed with the engine, and something is streaming
> data.

Yes, you are confused. Hang in there and perhaps you'll get it soon enough.

> A backup can never hurt you, but you're wasting your time with the
> restore. There is no loss of performance over a meta-data change.
> The system just keeps track of what's what. We designed it that
> way to encourage people to keep their databases in sync with
> their understanding of their problem. Anyway, any useful mechanism
> has to survive ordinary day to day operations.

Fine, it would be nice not to have that limitation. Let's do it too.

> >> 1. How often can you afford to perform a full backup?
> >
> >No more than once a week, I easily imagine others at more of a monthly
> >schedule.
>
> Does that mean you can tolerate a week's loss of data? If so, I don't
> think your application is typical. Most folks get a bit testy when they
> lose a week's worth of work.

You are missing the point. You said a FULL backup. I want to get to where I
can do regular backups without having to do a full backup, so that I
minimize the risk of losing large quantities of data.

> >> 2. After a disaster, how long can a recovery take?
> >
> >Isn't the goal of a commercial product to be as short as possible?
> >
>
> Nope. Everything is a tradeoff. Recovery time is always traded against
> normal update time. If recovery time were the only consideration, things
> would be a lot easier.

With the solution I propose you might miss a few hours of data, but your
recovery would be very quick if you had the restored "freeze point" database
"on deck", ready and waiting for the last delta to be applied before it is
brought on-line. How long would it take to apply a few days' worth of
changes to a database frozen a week ago? Probably not long. And if it would
take long, you probably aren't going to be considering InterBase as your
database in the first place.
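
For illustration, the recovery step is little more than a replay loop. The
delta format and every name below are assumptions of mine, not anything the
engine provides today:

/* Illustrative recovery flow (names and delta format are assumptions, not
 * an existing InterBase facility): a standby database restored from the
 * last freeze point has the deltas captured since then replayed into it. */
#include <stdio.h>

typedef enum { DELTA_INSERT, DELTA_UPDATE, DELTA_DELETE } DeltaOp;

typedef struct {
    DeltaOp     op;
    const char *table;
    long        record_id;
} DeltaEntry;

/* Stand-in for "apply one captured change to the standby database". */
static void apply_entry(const DeltaEntry *e)
{
    static const char *names[] = { "INSERT", "UPDATE", "DELETE" };
    printf("%s %s record %ld\n", names[e->op], e->table, e->record_id);
}

int main(void)
{
    /* Changes captured since the last freeze point, in commit order. */
    DeltaEntry since_freeze[] = {
        { DELTA_INSERT, "ORDERS",    1001 },
        { DELTA_UPDATE, "CUSTOMERS",   42 },
        { DELTA_DELETE, "SESSIONS",     7 },
    };
    size_t n = sizeof since_freeze / sizeof since_freeze[0];

    /* The standby ("on deck") copy is already a restore of the last
     * freeze point; bringing it current is just replaying the deltas. */
    for (size_t i = 0; i < n; i++)
        apply_entry(&since_freeze[i]);

    puts("standby is current; switch it on-line");
    return 0;
}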

> >> 3. Is it reasonable to assume a skilled DBA is available
> >> during recovery to resolve problems?
> >
> >If there isn't one then they probably don't have a database over 500MB.
> >Who cares?
> >
>
> I do. My goal is to reduce the expertise required to care for and feed
> a database, not to provide full employment for DBAs. If you want
> performance in Oracle, learn placement control. If you want both
> performance and a life, use InterBase.

We mean the same thing. We don't really want to care whether someone has a
DBA or not; InterBase should be usable out of the box without one. That
isn't to say we want to discourage people from having knowledgeable people
work with the product as DBAs, nor should it mean that we aren't going to
give a DBA some tools to meet their needs.

What we are talking about here won't matter to anyone who isn't using it. It
would be an optional, transparent capability of the engine, accessed through
a very simple interface.
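
To be concrete about "a very simple interface", something along these lines
is what I picture. Every name and signature below is hypothetical; nothing
like it exists in the product today, and the bodies are just stubs:

/* Hypothetical administrative interface for super-transactions. */
#include <stdio.h>

typedef long freeze_id_t;

/* Establish a new freeze point (super-transaction boundary); the engine
 * keeps the state as of this point reconstructible until it is released. */
static freeze_id_t db_freeze(const char *database)
{
    printf("freeze point established on %s\n", database);
    return 1;                     /* stub: would return the new id */
}

/* Stream every change committed since `since` into a delta file that can
 * later be applied to a copy restored from that freeze point. */
static int db_extract_delta(const char *database, freeze_id_t since,
                            const char *delta_file)
{
    printf("delta of %s since freeze %ld written to %s\n",
           database, (long)since, delta_file);
    return 0;
}

/* Release a freeze point so garbage collection can consolidate past it. */
static int db_release_freeze(const char *database, freeze_id_t point)
{
    printf("freeze point %ld on %s released\n", (long)point, database);
    return 0;
}

int main(void)
{
    /* Weekly: establish the base and take the full backup off-site.    */
    freeze_id_t base = db_freeze("prod.gdb");
    /* Nightly: extract only what has changed since the freeze point.   */
    db_extract_delta("prod.gdb", base, "monday.delta");
    /* After the next full backup the old freeze point can be dropped.  */
    db_release_freeze("prod.gdb", base);
    return 0;
}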

> >> 6. When recovering from a journal, what information would a
> >> DBA logically require to find the stopping point?
> >
> >Again, your biases are leaking in here.
>
> Sorry about that. If the problem was a goof, the roll forward has to
> stop before the goof. The problem is that you generally don't know
> exactly who did what. Unless there is some clue as to when a transaction
> started or what it did, the problem is unsolvable.

I'm not looking for that type of granularity; that's a "holy grail" issue.
I'm talking about a super-transaction: a conglomerate unit of work that
represents, at an administrative level, the changes in a database over a
period of time extending beyond typical user transactions.

> (Hint: there are
> 2 bits per transaction and four states. How many bits are left over
> to record start time?).

That depends on how many bits you have to work with in total. Bump it to 32
and we're covered; it's not the '80s anymore. I don't think what I am
requesting needs a start time, though it will likely need some other
things...
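
For anyone following along: the two bits per transaction in the inventory
encode the four states (active, limbo, committed, rolled back), so there
genuinely is no spare room today. Here is a quick sketch of that encoding,
plus my purely speculative widened slot carrying a super-transaction tag
rather than a start time:

/* Today's scheme: two bits per transaction in the inventory, four states,
 * no room for anything else. */
#include <stdio.h>
#include <stdint.h>

enum tx_state { TX_ACTIVE = 0, TX_LIMBO = 1, TX_COMMITTED = 2, TX_ROLLED_BACK = 3 };

/* Read the 2-bit state of transaction `id` from a packed inventory. */
static enum tx_state tip_state(const uint8_t *tip, long id)
{
    return (enum tx_state)((tip[id / 4] >> ((id % 4) * 2)) & 0x3);
}

static void tip_set_state(uint8_t *tip, long id, enum tx_state s)
{
    int shift = (int)((id % 4) * 2);
    tip[id / 4] = (uint8_t)((tip[id / 4] & ~(0x3 << shift)) | ((int)s << shift));
}

/* Pure speculation: a widened 32-bit slot keeps the same 2 state bits and
 * still has room for a super-transaction tag; no start time needed. */
typedef struct {
    uint32_t state     : 2;     /* same four states as today            */
    uint32_t super_txn : 30;    /* hypothetical freeze-point membership */
} wide_tip_slot;

int main(void)
{
    uint8_t tip[16] = { 0 };            /* room for 64 transactions */
    tip_set_state(tip, 42, TX_COMMITTED);
    printf("txn 42 state = %d (2 = committed)\n", (int)tip_state(tip, 42));

    wide_tip_slot slot = { TX_COMMITTED, 7 };
    printf("wide slot: state %u, super-txn %u\n",
           (unsigned)slot.state, (unsigned)slot.super_txn);
    return 0;
}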

Regards,
Jason Wharton
CPS - Mesa AZ
http://www.ibobjects.com