Subject: Re: import data from external table (huge file)
Author: zimrilin2000
But the number of rows in the file is 35 million, much less than
2,147,483,648.

What do you mean by the absolute number of row slots?



--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@t...>
wrote:
> At 11:43 AM 24/08/2005 +0000, you wrote:
> >Hi all!
> >
> > I'm trying to import the data from a 2.6 GB fixed-length format
> >external file with 35 million records into a regular table inside the
> >database. Each line in the external table file is 76 characters long.
> >I'm using a stored procedure to do the import, because I need to
> >detect and store duplicates from the file in another table (violation
> >of the table's primary key) and continue with the processing.
> >
> >I execute the SP (EXECUTE PROCEDURE INSERT_EXTERNAL_TABLE) with EMS
> >QuickDesk and, after 2 hours of processing (the database file grows to
> >5 GB), I get the following error:
> >
> >
> >EXECUTE PROCEDURE INSERT_EXTERNAL_TABLE
> >fmSQLScript.Script:
> >Unsuccessful execution caused by a system error that precludes
> >successful execution of subsequent statements.
> >I/O error for file "D:\SRC\INSERTFILES\DATA.DAT".
> >Error while trying to open file.
> >El dispositivo no reconoce el comando. [The device does not recognize the command.]
> >.
> >
> >I'm using Firebird-1.5.2.4731_win32
> >
> >This is the stored procedure:
> >
> >CREATE PROCEDURE INSERT_EXTERNAL_TABLE
> >AS
> > DECLARE VARIABLE A VARCHAR(13);
> > DECLARE VARIABLE B VARCHAR(10);
> > DECLARE VARIABLE C VARCHAR(9);
> > DECLARE VARIABLE D VARCHAR(10);
> > DECLARE VARIABLE E VARCHAR(20);
> >
> >BEGIN
> > /* Procedure body */
> >
> > FOR SELECT A,B,C,D,E FROM EXTERNAL_TABLE
> > INTO :A, :B, :C, :D, :E
> > DO
> > BEGIN
> > INSERT INTO TABLE (A,B,C,D,E)
> > VALUES (:A, :B, :C, :D, :E);
> >
> > WHEN SQLCODE -803 DO
> > BEGIN
> > INSERT INTO TABLE_TMP(A,B,C,D,E)
> > VALUES (:A, :B, :C, :D, :E);
> > END
> > END
> >
> > EXIT;
> >END
> >
> >Is the file too big to do the import?
>
> Yes. However, unless you are using a FAT32 or an NTFS 4 disk partition,
> it's not the size of the file, but the number of rows being read. The
> absolute number of row slots for a single table is 2^31 (just over
> 2 billion). So - once the SP has walked past the 2,147,483,648th row
> (or perhaps a few less) the number is up, literally.
>
> >Is there any other way of doing the import?
>
> You will need to split the input file into much smaller chunks. But you
> will encounter the same limit on row slots, eventually, when writing to
> the internal tables. If the internal tables will be subject to updates,
> deletes and/or further inserts, they will need to be a sufficient degree
> smaller than 2.148 billion rows to allow for the existence of multiple
> record versions and retain capacity for increase. Somehow you will need
> to find a way to spread these records across multiple tables by some
> kind of grouping.
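
A minimal Python sketch of this kind of splitting (it assumes each
76-character record already includes its line terminator, and the chunk
size and file paths are illustrative placeholders, not values from the
thread):

    # split_data.py -- copy fixed-length records into numbered chunk files.
    RECORD_LEN = 76            # record length stated in the original post
    CHUNK_RECORDS = 5000000    # records per chunk file -- an arbitrary choice

    def split_file(src_path, dst_prefix):
        chunk_no = 0
        written = CHUNK_RECORDS    # force the first chunk file to be opened
        dst = None
        with open(src_path, "rb") as src:
            while True:
                rec = src.read(RECORD_LEN)
                if len(rec) < RECORD_LEN:
                    break                  # end of file (or trailing bytes)
                if written == CHUNK_RECORDS:
                    if dst:
                        dst.close()
                    dst = open("%s.%03d" % (dst_prefix, chunk_no), "wb")
                    chunk_no += 1
                    written = 0
                dst.write(rec)
                written += 1
        if dst:
            dst.close()

    if __name__ == "__main__":
        # Paths are placeholders modelled on the file named in the error message.
        split_file(r"D:\SRC\INSERTFILES\DATA.DAT", r"D:\SRC\INSERTFILES\DATA_PART")

Each chunk file can then be declared as its own external table and
imported in turn.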
>
> Another thing that is a problem with trying to process a huge file in
> one hit is the strain on system resources from having so many
> uncommitted inserts. You also need to break this task into separate
> transactions, so that you can cleanly commit batches of about
> 8,000 - 10,000 records (see the sketch after this message).
>
> /heLen
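
And a hypothetical sketch of the batched, client-side import along the
lines Helen suggests, in Python with the fdb DB-API driver (any Firebird
driver with the standard DB-API interface would work the same way). The
table names MY_TABLE and MY_TABLE_TMP, the connection details, and the
field offsets (taken from the procedure's VARCHAR widths) are assumptions
for illustration; the loop commits every 10,000 rows and diverts
duplicates the way the WHEN SQLCODE -803 handler does:

    import fdb

    RECORD_LEN = 76
    BATCH_SIZE = 10000    # commit every 8,000 - 10,000 rows, as suggested

    def parse(rec):
        # Field widths follow the procedure's variable declarations
        # (13 + 10 + 9 + 10 + 20 = 62 bytes); the rest of the record is
        # assumed to be padding or the line terminator.
        return (rec[0:13].rstrip(), rec[13:23].rstrip(), rec[23:32].rstrip(),
                rec[32:42].rstrip(), rec[42:62].rstrip())

    con = fdb.connect(dsn=r"localhost:D:\SRC\MYDB.FDB",   # placeholder database
                      user="SYSDBA", password="masterkey")
    cur = con.cursor()
    pending = 0
    with open(r"D:\SRC\INSERTFILES\DATA.DAT", "rb") as src:
        while True:
            rec = src.read(RECORD_LEN)
            if len(rec) < RECORD_LEN:
                break
            row = parse(rec.decode("latin-1"))
            try:
                cur.execute("INSERT INTO MY_TABLE (A, B, C, D, E) "
                            "VALUES (?, ?, ?, ?, ?)", row)
            except fdb.DatabaseError:
                # Assumed to be the primary key violation (SQLCODE -803);
                # a real loop should inspect the error before diverting.
                cur.execute("INSERT INTO MY_TABLE_TMP (A, B, C, D, E) "
                            "VALUES (?, ?, ?, ?, ?)", row)
            pending += 1
            if pending >= BATCH_SIZE:
                con.commit()    # keep the number of uncommitted inserts small
                pending = 0
    con.commit()
    con.close()

This keeps the number of uncommitted inserts near BATCH_SIZE at any
moment, which addresses the resource strain described above, though the
row-slot ceiling on the target table still applies.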