| Subject | Re: [ib-support] Importing data into Firebird |
|---|---|
| Author | Helen Borrie |
| Post date | 2003-01-29T06:54:24Z |
At 05:32 AM 29/01/2003 +0000, you wrote:
>Hi all,
>
>I have the following problem:
>- We are in progress of migrating large database tables - millions
>of records per table - from an old ISAM based DBMS
>(CTOS OS - anybody else still using it? ;-) ) to Firebird 1.0.
>The raw data was successfully imported into Firebird using
>external tables.
>This is OK.

That's a good start. :-)

>Now we have to perform extensive data manipulations (restructurings,
>normalize-denormalize, conversions, translations ...), because the
>new database will have a completely different structure than
>the old one.
>
>The questions:
>- Is it possible somehow to keep track of progress of long-running
>update batches and stored procedures from ISQL or other
>DB admin tool? I mean, to display the number of processed records.

Yes, if you design your process so that it breaks back to the client from
time to time and commits records in batches of about 30,000; also so that
it logs things like exceptions. Alternatively - or in combination with
your breaks and logging - you can use events. All in all, to get feedback
from server to client you have to provide the means to feed something back...
Much more info about your environment would be needed before it would be
useful to start suggesting how to achieve these things.
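
By way of illustration only, here's a minimal sketch (in Firebird PSQL) of the
kind of batching procedure I mean. The table and column names - IMPORT_RAW,
ROW_ID and so on - are invented, and the actual row conversion is left as a
comment:

```sql
SET TERM ^ ;

/* Hypothetical batching procedure: processes source rows above a given key,
   stops after BATCH_SIZE rows and hands the last key back to the client. */
CREATE PROCEDURE IMPORT_BATCH (START_ID INTEGER, BATCH_SIZE INTEGER)
RETURNS (LAST_ID INTEGER, ROWS_DONE INTEGER)
AS
BEGIN
  ROWS_DONE = 0;
  FOR SELECT ROW_ID
      FROM IMPORT_RAW
      WHERE ROW_ID > :START_ID
      ORDER BY ROW_ID
      INTO :LAST_ID
  DO
  BEGIN
    /* ...convert this row and insert it into the target table... */
    ROWS_DONE = ROWS_DONE + 1;
    IF (ROWS_DONE >= BATCH_SIZE) THEN
    BEGIN
      POST_EVENT 'BATCH_DONE';  /* optional: lets listening clients see progress */
      SUSPEND;                  /* break back to the client with LAST_ID, ROWS_DONE */
      EXIT;
    END
  END
  SUSPEND;  /* final (short) batch */
END ^

SET TERM ; ^
```

Selecting from a procedure like this gives the client a natural break point -
and a progress figure - after every batch.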

>Currently I am able to watch only the hour-glass cursor on my
>screen. Some of the batches take hours, and I have no insight into
>their progress.
>- Is it possible to cancel the long-running batches?

What do you mean by "batches"? You cannot stop an SQL query... hence, one
of several good reasons to have the selection of source data under good
control with the aforementioned mechanisms.
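
To illustrate, with a procedure like the hypothetical IMPORT_BATCH sketched
above, the client (isql or any admin tool) stays in control of how much work
each statement does:

```sql
/* Each call does a bounded amount of work and is committed before the next
   one starts, so nothing runs longer than a single batch. */
SELECT LAST_ID, ROWS_DONE FROM IMPORT_BATCH(0, 30000);
COMMIT;

/* Repeat, feeding the returned LAST_ID into the next call, until ROWS_DONE
   comes back smaller than the batch size. */
```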

>(if an update batch does not finish in a reasonably long time
>interval, then it's likely that something went wrong ...).

You need to intercept the problem immediately it happens. You can use
exceptions to handle problems - it is up to you whether you "eat" an
exception by simply bypassing it, by logging it, or by having it terminate
the procedure by breaking out of it (the as-designed behaviour of the
inbuilt exception handling). In the latter case, again, it is up to you
whether you deal with the problem data from the client before resuming, or
just bypass the problem, commit what has succeeded so far, and
resume. (Resuming assumes your procedure is passing back, in output
parameters, suitable key values on breaks to provide inputs for resuming
where you left off before.)
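
A rough sketch of the "eat and log" option, again with invented names
(IMPORT_ERRORS is a hypothetical log table): the WHEN ANY handler swallows the
exception, records the offending key and lets the loop carry on; remove the
handler and the same exception would break out of the procedure instead.

```sql
SET TERM ^ ;

/* Hypothetical variant that "eats" conversion failures instead of breaking out. */
CREATE PROCEDURE IMPORT_BATCH_LOGGED (START_ID INTEGER, BATCH_SIZE INTEGER)
RETURNS (LAST_ID INTEGER, ROWS_DONE INTEGER, ROWS_FAILED INTEGER)
AS
DECLARE VARIABLE V_AMOUNT VARCHAR(30);
BEGIN
  ROWS_DONE = 0;
  ROWS_FAILED = 0;
  FOR SELECT ROW_ID, RAW_AMOUNT
      FROM IMPORT_RAW
      WHERE ROW_ID > :START_ID
      ORDER BY ROW_ID
      INTO :LAST_ID, :V_AMOUNT
  DO
  BEGIN
    BEGIN
      INSERT INTO TARGET_TABLE (ID, AMOUNT)
        VALUES (:LAST_ID, CAST(:V_AMOUNT AS NUMERIC(15,2)));
      ROWS_DONE = ROWS_DONE + 1;

      WHEN ANY DO
      BEGIN
        /* "eat" the exception: log the offending key and carry on */
        INSERT INTO IMPORT_ERRORS (ROW_ID, LOGGED_AT)
          VALUES (:LAST_ID, CURRENT_TIMESTAMP);
        ROWS_FAILED = ROWS_FAILED + 1;
      END
    END

    IF (ROWS_DONE + ROWS_FAILED >= BATCH_SIZE) THEN
    BEGIN
      SUSPEND;
      EXIT;
    END
  END
  SUSPEND;
END ^

SET TERM ; ^
```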

>If I stop the Firebird server while there is a running update
>process, then I suppose that it could lead to data corruption.

In a best-case scenario, stopping the server doesn't corrupt
data. However, you will have an unresolved transaction to deal with when
you restart and, since you don't know what the conditions were when you
stopped the server, any unresolved updates will be irretrievable. And you
will have the task, at some point, of running gfix or a backup and restore to
remove the lost transaction and its uncommitted record versions.

In short, stopping the server isn't a friendly way to control a
long-running query. On a Windows server, it has a habit of being risky,
especially if you have ill-advisedly set synchronised disk writes off. A
well-controlled process is much to be preferred.

>- Is there a well-tried practice to carry out such data migration
>tasks?

I've done lots of data migrations and the single thing they all had in
common was that they had nothing in common - a slight exaggeration, I admit
- but certainly every migration task has its own peculiarities. I always
like migrations where it's possible to get ALL of the legacy data into
external tables as type char(). From that point, they go into transitional
tables in a Firebird database, where I don't touch the imported data at
all, but simply add the keys I want to use for batching and
converting. These keys are generally just unique row identifiers for each
table. I *never* do any of the "final" stuff, like constraints or
indexing, on transitional tables.
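
To make that concrete, here's a sketch of the sort of DDL involved - the file
path, table and column names are all invented. The legacy export is exposed as
an external table of char() columns, and the transitional table adds nothing
except a generator-driven row identifier for batching:

```sql
/* Legacy export exposed to Firebird as an external table - every column char(),
   and the record layout must match the fixed-length external file exactly. */
CREATE TABLE EXT_CUSTOMER EXTERNAL FILE '/data/legacy/customer.dat' (
  CUST_NO    CHAR(10),
  CUST_NAME  CHAR(40),
  BALANCE    CHAR(15)
);

/* Transitional table: the imported data untouched, plus a row identifier
   for batching - no constraints, no indexes. */
CREATE TABLE TRANS_CUSTOMER (
  ROW_ID     INTEGER,
  CUST_NO    CHAR(10),
  CUST_NAME  CHAR(40),
  BALANCE    CHAR(15)
);

CREATE GENERATOR GEN_TRANS_CUSTOMER;

/* Pump the raw rows across, assigning the batching key as we go. */
INSERT INTO TRANS_CUSTOMER (ROW_ID, CUST_NO, CUST_NAME, BALANCE)
  SELECT GEN_ID(GEN_TRANS_CUSTOMER, 1), CUST_NO, CUST_NAME, BALANCE
  FROM EXT_CUSTOMER;
```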
At that point, I'll perform a backup and restore on the transitional
database and copy the restored transitional database to a CD. That
gives me a clean, complete reference to the legacy database's data, which I
can use and reuse throughout the migration cycle, first to extract test
data and, when I'm ready, to be the source for the real data pump. It also
means that I can work on the real database schema at a very abstract level
and be alerted to many of the potential conversion problems well before I
begin the actual pumping.
heLen