Subject Re: Best Practice Question - How many rows to update before commit?
Author Myles Wakeham
Thanks to everyone for their advice & input. The setup for this is using an
external program with an ODBC interface to FB1.5 (I think it's the ODBC
Driver that I downloaded from IBPhoenix - the open source one).

The program I'm using for inserting is simply a loop that reads each source
data row and then does a single INSERT into Firebird for each one. There
are 12 different tables being loaded. Of these, 8 have foreign key
relationships with other tables in the database, plus some additional
indexes so users get faster query results afterwards. I am not dropping
indexes before I do the inserts, but that is a very good idea and something
I can add in to see if we get better results.
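As a sketch of the commit-every-N-rows pattern I'm after (table, column,
and batch-size values here are made up, and I'm showing it with Python's
built-in sqlite3 just so it runs standalone; the same loop shape applies
through an ODBC connection such as pyodbc to Firebird):

```python
import sqlite3

BATCH_SIZE = 500  # assumed commit interval; worth tuning empirically

def bulk_insert(conn, rows):
    """Insert rows, committing once per BATCH_SIZE instead of per row."""
    cur = conn.cursor()
    pending = 0
    for row in rows:
        # Parameterized INSERT: the driver handles quoting, even for
        # very wide text values.
        cur.execute("INSERT INTO target_table (id, body) VALUES (?, ?)", row)
        pending += 1
        if pending >= BATCH_SIZE:
            conn.commit()  # one transaction per batch, not per row
            pending = 0
    if pending:
        conn.commit()  # flush the final partial batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target_table (id INTEGER, body TEXT)")
bulk_insert(conn, ((i, "row %d" % i) for i in range(1200)))
print(conn.execute("SELECT COUNT(*) FROM target_table").fetchone()[0])
```

The point is simply that each commit pays a fixed cost, so amortizing it
over a batch rather than a single row is where the speedup comes from.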

My experience has been that with the database on a networked server over a
LAN I get roughly a 3x performance increase vs. running the database server
locally. This suggests to me that the ODBC driver is playing a big part in
reducing performance, although I haven't tested this with different drivers
yet.

One other thing I have noticed is that after about 8-9 of the tables have
been processed, I periodically get disconnected from the database, losing
the transaction and cursor and halting further INSERTs (each one fails from
that point on). I'm not sure if there is a memory leak or something like
that going on with the data load, but I believe I may have to force a
disconnect/reconnect for each table to keep the connection active and
free up resources.
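What I have in mind for the reconnect-per-table workaround looks roughly
like this (the table names and database path are made up, and I'm again
using stdlib sqlite3 so it runs standalone; with the ODBC driver the
reconnect would go through the DSN instead):

```python
import os
import sqlite3
import tempfile

TABLES = ["customers", "orders", "items"]  # assumed table list
db_path = os.path.join(tempfile.mkdtemp(), "import.db")

def load_table(conn, table, rows):
    # Identifiers can't be bound as parameters, hence the interpolation;
    # the row values themselves stay parameterized.
    cur = conn.cursor()
    for row in rows:
        cur.execute("INSERT INTO %s VALUES (?)" % table, (row,))
    conn.commit()

for table in TABLES:
    conn = sqlite3.connect(db_path)  # fresh connection for this table
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS %s (v INTEGER)" % table)
        load_table(conn, table, range(10))
    finally:
        conn.close()  # drop the connection before moving to the next table
```

Opening and closing per table costs a little connection overhead, but it
guarantees whatever the driver is holding onto gets released between loads.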

The database has been backed up and restored to ensure sweep time is minimal
here. I have also turned automatic sweep off, so the sweep should be under
my control.

Now in specific answer to Adam's questions...

> On 10-Nov-2007 07:43:37, Adam wrote:
> Are you doing any slow transformations on each row?

No, there are no triggers or any form of data transforms being done in the
INSERT statement.

> Is the bottleneck on how quickly the ODBC source can give you the data
> rather than Firebird itself?

I suspect it is. I'll run some tests with different providers to see if
this is the case.

> What sort of data? My example was using 3 integer fields, but if you
> are pumping in really wide records I could see it being slower.

The data is about 75% text and 25% numeric. Some of the text fields can be
very large: in about 50% of cases there are fields that can be up to
30,000 characters in length, although it's rare to see data of this size
actually being pumped over.

> Are there any indices being built as you import?

No, but I will test with a full index drop before doing the import to see if
it helps.
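For reference, what I'm planning to test is the drop/load/rebuild pattern.
In Firebird the statement would be ALTER INDEX <name> INACTIVE before the
load and ALTER INDEX <name> ACTIVE after (as I understand it, indexes that
back foreign keys can't be deactivated, which matters for 8 of my tables).
A standalone sketch of the same pattern, shown with sqlite3's DROP/CREATE
INDEX since that runs anywhere:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, body TEXT)")
conn.execute("CREATE INDEX idx_t_body ON t (body)")

# Drop the index before the bulk load so each INSERT skips index upkeep.
conn.execute("DROP INDEX idx_t_body")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 ((i, "row %d" % i) for i in range(1000)))
conn.commit()

# Rebuild once, after all rows are in; one bulk build beats 1000
# incremental updates.
conn.execute("CREATE INDEX idx_t_body ON t (body)")
```
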

> Is your import tool being run on the database server or is it pulling
> or pushing data over a LAN?

I have found a major improvement when I push the data over a LAN vs. running
locally. Even with a local connection, however, I'm using FBCLIENT over
TCP/IP and never the FBEmbed driver.

> Is the ODBC data source competing for I/O (eg file on the same
> physical disk as the Firebird database file).

Yes, in some cases it is. I believe one of the target platforms has a RAID
array on that server, so I'm not sure how the disks are physically laid
out, but I wouldn't be surprised if that were the case.


Myles Wakeham
Director of Engineering
Tech Solutions USA, Inc.
Scottsdale, Arizona USA
Phone +1-480-451-7440
