Subject | Re: [IB-Architect] Backups of large database & super transactions |
---|---|
Author | Jim Starkey |
Post date | 2000-06-16T20:17:57Z |
At 11:36 AM 6/16/00 -0700, Jason Wharton wrote:
>> Like most hard problems, the place to start is to collect and
>> prioritize requirements. When we have a handle on the problem,
>> we can work on a solution.
>
>I know what the problem is and I am working on a solution. Just seems that
>people are more interested in shooting holes in my ideas than seeing how
>they solve real problems... I'm out in the real world solving business
>problems with InterBase on a daily basis... (not trying to imply anything
>other than that)

Jason, please start with the requirements. Unless and until you can
state what problem you're trying to solve, it is impossible to judge
whether any particular scheme is a satisfactory solution.
I take it that you are unhappy with both the time it takes to back up
a database containing many large and largely stable images and the
time it takes to restore the database in the face of a disaster.
Let's start with disasters. They come in two flavors: natural (the disk
crashes) and not so natural (somebody goofs and deletes something very
important). The natural kind can be handled with a shadow set or a RAID.
The not-so-natural sort requires running time backwards.
Since reversing the polarity on the CPU clock to run time backwards is
awkward, we're pretty much reduced to finding a time machine and going
forward until just before the goof. Now we get some choices.
The original journalling mechanism produced a backup stream by
copying database pages to the journal and then journalling incremental
changes. To minimize the size of the journal, index changes weren't
journalled. A full, proper recovery was to run the journal forward,
gbak the recovered database, then restore a clean copy. The assumptions
behind the design were that disasters would be infrequent and that the
eventual joy of recovery compensated for its relatively high cost. Or,
to put it another way, we optimized for efficiency during normal
operation at the cost of an inefficient recovery. Another deficiency is
that the mechanism was broken many years ago and is not worth "recovery."
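For illustration only (the record layout and names below are my own
invention, not the InterBase on-disk journal format), the recovery pass
amounts to replaying a stream of journal records against a page file.
A minimal C sketch of that replay:

    /* Sketch only: an invented journal record layout, not InterBase's. */
    #include <stdio.h>
    #include <stdlib.h>

    #define PAGE_SIZE 4096

    enum jrn_type { JRN_PAGE_IMAGE, JRN_PAGE_DELTA };

    typedef struct {
        int  type;      /* JRN_PAGE_IMAGE or JRN_PAGE_DELTA */
        long page_no;   /* database page the record applies to */
        long offset;    /* for deltas: byte offset within the page */
        long length;    /* bytes of data that follow (PAGE_SIZE for an image) */
    } jrn_header;

    /* Write one journal record into the database file being recovered. */
    static int apply_record(FILE *db, const jrn_header *hdr, const char *data)
    {
        long page_base = hdr->page_no * (long) PAGE_SIZE;
        long where = (hdr->type == JRN_PAGE_IMAGE)
                         ? page_base
                         : page_base + hdr->offset;

        if (fseek(db, where, SEEK_SET) != 0)
            return -1;
        if (fwrite(data, 1, (size_t) hdr->length, db) != (size_t) hdr->length)
            return -1;
        return 0;
    }

    /* Run the journal forward: read records in order and apply each one. */
    int replay_journal(FILE *journal, FILE *db)
    {
        jrn_header hdr;

        while (fread(&hdr, sizeof hdr, 1, journal) == 1) {
            char *data = malloc((size_t) hdr.length);
            if (!data)
                return -1;
            if (fread(data, 1, (size_t) hdr.length, journal)
                    != (size_t) hdr.length
                || apply_record(db, &hdr, data) != 0) {
                free(data);
                return -1;
            }
            free(data);
        }
        return 0;
    }

Since index changes were never journalled, the replayed file presumably
still has stale indexes, which is why the original procedure finished
with a gbak of the recovered database and a restore of a clean copy.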
There are lots of variations on the theme of spinning off a shadow
and maintaining an incremental journal. The differences revolve
around the relative costs of journalling vs. recovery. Full page
changes are easy and robust, but voluminous. Incremental data
changes are minimal but expensive to restore. Add incremental index
changes and you get a happy medium -- a voluminous journal and slow
recovery.
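To put rough numbers on that trade-off (the page size, change rate, and
average delta size below are invented for illustration, not measurements
of any real system), a quick back-of-envelope calculation:

    /* Back-of-envelope journal volume; all figures are invented. */
    #include <stdio.h>

    int main(void)
    {
        const double page_size  = 4096.0;  /* assumed page size, bytes      */
        const double changes    = 1.0e6;   /* assumed page updates per day  */
        const double delta_size = 64.0;    /* assumed avg delta + header    */

        double full_page_journal = changes * page_size;   /* one page/change */
        double delta_journal     = changes * delta_size;  /* one delta/change */

        printf("full-page journal:   %.1f GB/day\n",
               full_page_journal / (1024.0 * 1024 * 1024));
        printf("incremental journal: %.1f MB/day\n",
               delta_journal / (1024.0 * 1024));
        return 0;
    }

With those made-up figures the full-page journal is 64 times the size of
the incremental one, which is the "voluminous" versus "minimal"
distinction above.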
If we were to add page version numbers to the basic page layout,
we could do incremental shadow spinoffs, which would be the
cat's meow for large, reasonably stable databases.
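To sketch what that could mean (the page header and version counter
below are assumptions of mine, not the actual on-disk structure): stamp
each page with a monotonically increasing change counter on every write,
and have each spinoff copy only the pages stamped since the previous one.

    /* Illustrative only: invented page header and spinoff routine. */
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 4096
    #define N_PAGES   8            /* toy in-memory database */

    typedef struct {
        unsigned long version;     /* stamped from a global counter on write */
        char body[PAGE_SIZE - sizeof(unsigned long)];
    } page;

    static page database[N_PAGES];
    static unsigned long change_counter = 0;

    /* Every page write stamps the page with the current counter value.
       Caller must keep len within sizeof(database[0].body). */
    void write_page(int page_no, const char *data, size_t len)
    {
        memcpy(database[page_no].body, data, len);
        database[page_no].version = ++change_counter;
    }

    /* Copy only pages changed since the last spinoff; return the new
       high-water mark to remember for the next incremental pass. */
    unsigned long incremental_spinoff(unsigned long last_mark, FILE *shadow)
    {
        unsigned long new_mark = last_mark;

        for (int i = 0; i < N_PAGES; i++)
            if (database[i].version > last_mark) {
                fwrite(&i, sizeof i, 1, shadow);                /* page number */
                fwrite(&database[i], sizeof(page), 1, shadow);  /* page image  */
                if (database[i].version > new_mark)
                    new_mark = database[i].version;
            }
        return new_mark;
    }

A full spinoff is the same pass with a last_mark of zero; pages that
never change are copied once and then skipped by every later spinoff,
which is why this wins for a large, mostly stable database.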
But unless and until we agree on the requirements for disaster
recovery, we can't evaluate possible alternatives.
So, Jason, in your application:
0. What type of disasters must you prepare against?
1. How often can you afford to perform a full backup?
2. After a disaster, how long can a recovery take?
3. Is it reasonable to assume a skilled DBA is available
during recovery to resolve problems?
4. How much additional I/O can you tolerate to drive a journal?
5. How much disk (measured in multiples of database size) are
you willing to dedicate to disaster recovery?
6. When recovering from a journal, what information would a
DBA logically require to find the stopping point?
7. What questions have I missed?
Jim Starkey