firebird-architect - Re: [IB-Architect] Shadows questions...

Subject	Re: [IB-Architect] Shadows questions...
Author	Olivier Mascia
Post date	2000-06-23T18:54:45Z

From: "Leyne, Sean" <InterbaseArchitecture@...>

| So your answer is 46 hours in comparison to 52 hours, right? <grin>
|
| How big is your db? (and can't remember off hand from your previous
| postings...)
|
| Well, I don't know about you but, that doesn't seem to be much of an
| increase.
|
|
| Which makes me question whether the current "create a backup using the
| shadow process" discussion has any real benefit, within the context of
| enabling a new HIGH-SPEED database backup process.
|
| Sean

Let's not draw conclusions too fast, Sean.

As source code is not yet available, few people can really comment for now
on exactly how the shadow creation process is designed. Obviously I'm not
among those people. The creation process may have been designed to impact
connected users as little as possible (in terms of performance). In a new
backup system, even if built around the principle of shadows but not
necessarily using the exact code of the shadow subsystem, the performance
could be balanced differently. I mean that it looks perfectly acceptable to
impose a reasonable impact on performance for users while such a backup is
in progress. After all, we want that backup, and not simply a shadow.
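
To illustrate what "balancing" could mean, here is a minimal sketch in
Python, not InterBase code. It assumes the copy is done in batches of pages
with a configurable pause between batches; the function name and the
parameters batch_size and pause_seconds are hypothetical, and the sketch
ignores pages being written while the copy runs.

```python
import time


def copy_pages_throttled(source_pages, batch_size=64, pause_seconds=0.0,
                         sleep=time.sleep):
    """Copy pages in batches, pausing between batches to leave I/O for users.

    pause_seconds=0 copies as fast as possible (quiet database);
    a larger value trades backup speed for less impact on users.
    """
    target = []
    for start in range(0, len(source_pages), batch_size):
        target.extend(source_pages[start:start + batch_size])
        more_to_copy = start + batch_size < len(source_pages)
        if pause_seconds > 0 and more_to_copy:
            sleep(pause_seconds)  # yield the disk to connected users
    return target
```

A plain shadow might favor a long pause, while a backup could use a short
one or none at all.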

It all depends on the user load, of course, but initializing a shadow
file (a real one, or one belonging to a new backup scheme based on it) when
the database is quiet should be nearly as quick as an OS COPY, if properly
designed and coded. If performance decreases, it is because of external
causes: the machine being loaded by other processing (not necessarily IB), a
heavy user load on the DB/engine, or I/O limitations.

I think Dalton also has gbak processes running (several of them) at the
same time. If true, that has an impact on performance. As far as I
remember, Dalton's databases are measured in many GB. In any case, it takes
time to 'copy' such a volume.

Along the same lines, let's not forget the 'between-the-lines' idea floating
around: the ability to incrementally re-synch a shadow with its DB after the
shadow has been 'detached' from the DB for some time. There is still a
significant amount of research to do to see whether it is possible to design
such an after-detach re-synch. And we may go nowhere, yes. But if a bright
implementation idea can be found, it may be very effective to re-synch such
a 'backup-shadow'.

The scheme then becomes :

1. Create a 'backup-shadow': pretty much the same as today's shadows, maybe
optimized in some way to get it ready quicker than today. (It is yet to be
proven that today's method is not 'quick'...) At the end of the shadow
creation process, the shadow is detached from the DB. It is a snapshot.

2. Backup your detached shadow any way you like to any media.

3. When you want to take a new backup, you've got a choice: start again at 1,
or use the 'incremental' procedure. The latter brings your shadow back
in-synch (yet another algorithm to be designed) and detaches it as soon as
that is done, so you can again back it up to tape or do whatever you like
with it to make it safe. This step *must* be incremental, to be *way* quicker
than a full new shadow rebuild from scratch.
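
The three steps above can be sketched as a toy model in Python. This is not
InterBase code: it assumes the engine can record which page numbers are
written while the shadow is detached, and that the page count stays fixed.
The names PagedDB, BackupShadow, create and resync are hypothetical, and the
real tracking method is exactly the algorithm still to be designed.

```python
class PagedDB:
    """Toy database: fixed-size pages plus the set of pages written
    since the backup-shadow was last detached (assumed tracking)."""

    def __init__(self, page_count, page_size=16):
        self.pages = [bytes(page_size) for _ in range(page_count)]
        self.dirty = set()

    def write_page(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)


class BackupShadow:
    """Detached snapshot of a PagedDB."""

    def __init__(self):
        self.pages = []

    def create(self, db):
        """Step 1: full copy, then detach. Returns the pages copied."""
        self.pages = list(db.pages)
        db.dirty.clear()
        return len(self.pages)

    def resync(self, db):
        """Step 3: copy only the pages touched since detach, then detach
        again. Returns the pages copied."""
        copied = 0
        for page_no in sorted(db.dirty):
            self.pages[page_no] = db.pages[page_no]
            copied += 1
        db.dirty.clear()
        return copied
```

In this model the cost of a re-synch is proportional to the number of
distinct pages touched, not to the size of the database, which is what
would make step 3 way quicker than step 1.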

This means you could take full tape snapshots more often.
Disaster recovery: replace the damaged DB with the most recent
detached-shadow instance you have. You can't recover quicker than by an OS
copy.

** However, I am not at all satisfied with this so far. **

Main reason: what happens between the time the shadow is detached and the
time it is brought back in-synch for the next backup phase? You have no
protection against failures between two snapshots. Just like gbak. You can
revert to the last snapshot, and you have a means of getting such complete
snapshots more often than with current techniques; that's fine (if
technically feasible). But there is still a gap of time during which you run
without any protection. You may lose all the work done since the last
snapshot.

I would appreciate it if we could design a complementary scheme to protect
the data between any two snapshots as much as possible. This sounds like a
request for some transaction logging. But I would favor any idea based on
on-line page shadowing of only the pages touched since the last
'shadow-backup' time. This partial shadow could be applied back to the last
snapshot taken. You would bring your DB back to life quite quickly, up to a
point very close to the instant of failure. Some timestamp mechanism could
allow you to limit this if needed for some reason ('restore to the latest
stable state, but not further than this morning 8 am'). I understand that
what I have in mind may sound unclear and incomplete for now. We all need
time to learn, learn, learn the existing code and techniques, and later
formalize this.
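
As a rough illustration of the partial-shadow idea, here is a minimal sketch
in Python. It assumes each page written since the last snapshot is saved as
a (timestamp, page_no, data) entry in write order; the function recover and
the parameter not_after are hypothetical. It says nothing about whether the
state reached at a given timestamp is transactionally 'stable', which a real
design would have to address.

```python
def recover(snapshot_pages, page_log, not_after=None):
    """Rebuild the pages from the last snapshot plus the partial shadow.

    page_log: list of (timestamp, page_no, data) tuples in write order.
    not_after: optional limit; entries with a later timestamp are skipped.
    The snapshot itself is left untouched.
    """
    pages = list(snapshot_pages)
    for timestamp, page_no, data in page_log:
        if not_after is not None and timestamp > not_after:
            continue  # written after the requested restore point
        pages[page_no] = data  # later entries overwrite earlier ones
    return pages
```

Without a limit, this brings the DB up to the last page shadowed before the
failure; with not_after set, it stops at the requested point in time.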

Olivier Mascia
om@...