Subject: Re: Handling large imports while system is in production
Author: rudibrazil78
--- In firebird-support@yahoogroups.com, Mark Rotteveel wrote:
>
> On Tue, 05 Feb 2013 13:21:01 -0000, "rudibrazil78" wrote:
> > Hi all
> >
> > We've been having to deal with a bad situation for quite a few years now -
> > how to handle large imports without taking the server down (in fact that
> > data is meant to be accessible as soon as it's ready in production).
> >
> > This is a call center software which routinely imports lists of clients
> > and possible clients to contact.
>
> How often do you do this, and isn't it possible to do it in off-hours (eg
> at night, during lunchbreak)?

Twice a week. Our client can stop their production for one hour, and that's long enough for them.


>
> > The import happens on 2 main tables, clients and calls, which are exactly
> > the same tables the operators will use/update while working.
>
> Are you only adding, or also updating? Do you do everything in a single
> transaction, or a transaction per individual change, or something in
> between (eg a commit for every x updates)?
>

Checking if the record exists and must be inserted (some conditions will stop a record from being added, like "client already in another active campaign").

About 25 inserts/updates per imported line.

I do those in a single transaction per line, but if I use a single transaction for multiple lines it renders the system inoperable (I'm guessing that locking the calls and clients tables in one long transaction makes the operators queue up to update those).
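
For example, the "already in another active campaign" check is roughly shaped like this (the campaigns table and the column names here are simplified placeholders, not the real schema):

-- Insert the imported client only if he is not already attached to an
-- active campaign; otherwise the line is skipped.
INSERT INTO clientes (phone, name, campaign_id)
SELECT :phone, :name, :new_campaign_id
FROM rdb$database
WHERE NOT EXISTS (
  SELECT 1
  FROM clientes c
  JOIN campaigns cp ON cp.id = c.campaign_id
  WHERE c.phone = :phone
    AND cp.active = 1
);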


> > These are very large tables (over 200 fields each), with many fk's (and a
> > lot of the fk indexes are reportedly bad ones).
> >
> > Not only importing is taking very long, but while importing is happening,
> > performance in production takes a severe hit.
>
> What is 'very long' and how many updates are we talking about?


The calls table has 130 fields, one PK, 60 FKs and about 20 extra indexes.
The clientes table has 186 fields, one PK, 45 FKs and about 20 extra indexes.
The other auxiliary tables (client_fones, custom_fields) are all small and operations on those are fast.

For testing's sake, I dropped all indexes and FKs on both big tables and things got much faster. I might consider this as a last-resort solution, though.
I'm afraid that if I constantly drop and recreate DB objects I'm going to have a bad day.
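
One thing I might try before actually dropping anything is deactivating the plain indexes around the import and reactivating them afterwards; as far as I know this does not work for the indexes that enforce PK/FK/unique constraints, and the index name below is made up:

-- Switch a regular index off for the duration of the import...
ALTER INDEX idx_calls_status INACTIVE;
-- ... run the import ...
-- ... then switch it back on (reactivating rebuilds the index).
ALTER INDEX idx_calls_status ACTIVE;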


>
> > I know theres no magic wand, but I seriously need to start making a
> > checklist of practices I should be trying out to contour this situation.
> >
> > Thanks in advance... please let me know if I can clarify the issue further.
>
> You could consider adding a separate staging table where you first load
> the data into the database in that separate table, and only then (eg using
> a stored procedure, or using MERGE or UPDATE OR INSERT, etc) do you start
> loading it into the actual table(s). You might even massage the data before
> loading it into the final destination (eg remove records that don't need to
> be updated). This might not necessarily improve things, but I expect it
> will decrease time to load the data, and also reduce the performance hit
> for your operators.
>
> Mark
>

I'm already using UPDATE OR INSERT wherever applicable.
I'm not sure how using a secondary table and then moving the data to the real table would help that much; after all, the data will end up having to be transferred to the actual table anyway.
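
If I understand the idea, it would be something along these lines (staging_clients and the columns are just placeholders I made up): load the raw file into an indexless staging table first, clean it up there, and then push it into the real table in one set-based statement:

-- Move the prepared rows from the staging table into the production table.
MERGE INTO clientes c
USING staging_clients s
  ON c.phone = s.phone
WHEN MATCHED THEN
  UPDATE SET c.name = s.name, c.campaign_id = s.campaign_id
WHEN NOT MATCHED THEN
  INSERT (phone, name, campaign_id)
  VALUES (s.phone, s.name, s.campaign_id);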

I guess I'll start looking into external tables and such?
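
From what I've read, an external table would look something like this (the file path, the fixed-width layout and the ExternalFileAccess setting in firebird.conf all still need checking on my side):

-- The file must use fixed-length records, hence CHAR columns plus a
-- CHAR(2) column to absorb the CR/LF of each line.
CREATE TABLE clients_import EXTERNAL FILE '/data/import/clients.txt' (
  phone CHAR(20),
  name  CHAR(80),
  crlf  CHAR(2)
);

-- Read from it like a normal table and feed the staging/real tables.
INSERT INTO staging_clients (phone, name)
SELECT TRIM(phone), TRIM(name)
FROM clients_import;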

BTW, thanks for the replies.