Subject RE: [IB-Architect] GBAK processing
Author Leyne, Sean
Markus,

In general, I agree with you regarding the most painful part being the
restore.

In the vast majority of cases (98% or more), however, it is the backup
process which is run and which is, therefore, most likely to affect user
access and system performance. In some cases, the length of time a backup
takes impacts the DBA's ability to run the backup (in cases where multiple
backups are scheduled/run each and every day).

My initial concern is with improving the performance of the most common
operation. By definition, a restore is an exceptional situation.

I do, however, very much appreciate the fact that restoring a multi-GB
DB does present significant issues.

I don't understand your reference to a "virtual backup"; please expand.
How would it work for multi-file DBs?


Sean

-----Original Message-----
From: Markus Kemper [mailto:mkemper@...]
Sent: Wednesday, April 26, 2000 2:05 PM
To: IB-Architect@egroups.com
Subject: Re: [IB-Architect] GBAK processing


Sean,

Not looking at the code, I believe that it is single
threaded. From what I've seen with customers, the
most painful part of GBAK is time to recovery from
a failure/corruption. Time to recovery being either:

Restore from a Backup

(gbak -c ...)

Backup (corrupted db) and Restore

(gbak -b ...)
(gbak -c ...)

Any way you slice it, this process gets more painful
as the database grows. Thus, I suspect that a different
strategy will need to be considered for really large
databases.

We've bounced around a couple ideas as to how we might
speed up gbak as it is today.

1) Add functionality to gbak to create a virtual
'backup' file. You can do this with UNIX/Linux
today outside of the gbak program. You might be
able to do it on Windows too if you have something
like MKSToolkit, but I have not tried it.

This syntax is from memory, not tested today, so
please forgive me if I am incorrect, but the idea
is to write the backup file to a pipe and read
from it in the background, thus performing the
two-step operation in one.

mknod gpipe p
gbak -c gpipe new_database.gdb &
gbak -b old_database.gdb gpipe

2) Creating indexes is a large part of the expense
in restoring a database. Currently a full table
scan is required for each index even if they are
on the same table. Meaning that

create table foo (
f1 integer,
f2 integer );
create index idx_1 on foo ( f1 );
create index idx_2 on foo ( f2 );

would cause two full table scans to build the
indexes during a restore. The idea was to modify
gbak to build multiple indexes at a time, thus
only having to do one full scan per table when
building database indexes.
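
The single-scan idea can be sketched outside of gbak. What follows is a
minimal illustration in Python, not gbak's actual code: the function name
build_indexes_single_scan and the in-memory "table" are assumptions made
up for the example, and a plain sorted dictionary stands in for a real
index structure. The rows are handed over as a one-shot iterator, so the
table can only be read once no matter how many indexes are built.

```python
from collections import defaultdict


def build_indexes_single_scan(rows, indexed_columns):
    """Build one index per column while reading the rows only once.

    rows: any iterable of dict-like records (may be a one-shot iterator).
    indexed_columns: names of the columns to index.
    Returns {column: {key value: [row numbers]}} with keys in sorted order.
    """
    indexes = {col: defaultdict(list) for col in indexed_columns}

    # The single full table scan: every row feeds every index.
    for row_id, row in enumerate(rows):
        for col in indexed_columns:
            indexes[col][row[col]].append(row_id)

    # Sort the keys so each result resembles an ordered index.
    return {col: dict(sorted(idx.items())) for col, idx in indexes.items()}


# Hypothetical contents of table foo from the example above.
foo = [
    {"f1": 3, "f2": 10},
    {"f1": 1, "f2": 20},
    {"f1": 3, "f2": 30},
]

# iter(foo) can be consumed only once, so both indexes share one scan.
indexes = build_indexes_single_scan(iter(foo), ["f1", "f2"])
```

The trade-off in a real engine would be memory: building every index of
a table at once means holding all of their sort buffers at the same time.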

Markus




> Is GBAK (the backup process) a single threaded function, performing 1
> task at a time, or is it multi-threaded?
>
> Since I think the answer is that it is single threaded, I'll ask a
> couple of other questions.
>
> Would a multi-threaded backup process perform faster than the current
> model?
>
> What issues would prevent (or make very difficult) GBAK being made
> multi-threaded?
>
> I've been thinking about a different approach to GBAK where it creates
> threads (up to a certain limit) for each DB object to be backed up, with
> these threads reading the DB information and building "backup
> information" to be written out by a single "writer thread". I
> understand that the system structure would need to be processed by a
> single thread and written to disk before any application data.
>
> Sean
>
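
The reader-threads-plus-single-writer layout described in the original
question can be sketched as follows. This is a minimal illustration in
Python under stated assumptions, not a design for gbak itself: the name
backup_tables, the "__metadata__" marker and the in-memory tables are all
hypothetical, and a list stands in for the backup file. The metadata is
recorded by one thread before any reader starts, matching the constraint
that the system structure is written before application data.

```python
import queue
import threading


def backup_tables(tables, max_readers=4):
    """Copy rows from several tables into one output via a single writer.

    tables: {table name: list of rows} (stand-in for DB objects).
    Returns the list of records in the order the writer stored them.
    """
    written = []                 # stand-in for the backup file
    work = queue.Queue()         # table names waiting for a reader
    out = queue.Queue()          # records waiting for the writer

    # System structure first, by a single thread, before any data.
    written.append(("__metadata__", sorted(tables)))

    for name in tables:
        work.put(name)

    def reader():
        # Each reader takes whole tables until none are left.
        while True:
            try:
                name = work.get_nowait()
            except queue.Empty:
                return
            for row in tables[name]:
                out.put((name, row))

    def writer():
        # The only thread that touches the output.
        while True:
            item = out.get()
            if item is None:     # sentinel: all readers are done
                return
            written.append(item)

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()

    reader_count = min(max_readers, len(tables))
    readers = [threading.Thread(target=reader) for _ in range(reader_count)]
    for t in readers:
        t.start()
    for t in readers:
        t.join()

    out.put(None)                # tell the writer to stop
    writer_thread.join()
    return written
```

Rows from different tables may interleave in the output, but rows of any
one table keep their order, because each table is read by exactly one
thread and the queue is first-in, first-out.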