Subject: Re: import data from external table (huge file)
Author zimrilin2000
Hi again!

I have done some more testing and found the problem:

I tested adding a surrogate key and making fields C and D a unique
key. The result is the same: from file 26 on, the import time grows
(file 35 takes 1 hour again). But if I don't define the unique key,
every file is imported in less than 2 minutes, so the whole process
of importing 35 million rows completes in exactly 1 hour.

So the problem must be the index created for the composite unique key
(C, D), because field D has low selectivity and the table has so many
rows, am I right?

I could leave out the unique key (although the file may contain
duplicate values), but is there another way around this so the import
doesn't take 10 hours, or is there no other solution?
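One workaround I am considering is to load with no key at all,
deduplicate once at the end, and only then add the constraint, so the
index is built in a single pass instead of being maintained during 35
million inserts. This is only a sketch (the constraint name is made
up, and whether RDB$DB_KEY comparisons can be used this way may depend
on the Firebird version):

```sql
/* Load all 35 files first, with no unique key on TABLE1, then: */

/* 1. Remove duplicate (C, D) pairs, keeping one physical row per
      pair.  RDB$DB_KEY identifies a physical row in Firebird. */
DELETE FROM TABLE1 t
WHERE EXISTS (SELECT 1 FROM TABLE1 d
              WHERE d.C = t.C
                AND d.D = t.D
                AND d.RDB$DB_KEY < t.RDB$DB_KEY);
COMMIT;

/* 2. Only now build the unique index, in one pass over the data. */
ALTER TABLE TABLE1 ADD CONSTRAINT UQ_TABLE1_C_D UNIQUE (C, D);
```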

Thanks

Diego


--- In firebird-support@yahoogroups.com, "zimrilin2000" <drodriguez@a...> wrote:
> Hi,
>
> I have enough space on the hard disk (about 300 GB), and I did
> commit after executing the stored procedure.
>
> Last night I tested the insertion another way: I split the file
> into 35 files with 1 million records each. Up to file number 26,
> each file is imported in about 2 minutes (I insert a log record
> with the times into a table between inserts, after committing),
> but file 27 took 20 minutes, and after that each file takes 5
> minutes more than the previous one. File 35 takes 1 hour to import.
>
> The table definition is
>
> CREATE TABLE TABLE1(
> A VARCHAR(13),
> B VARCHAR(10),
> C VARCHAR(9),
> D VARCHAR(32),
> E SMALLINT,
> DATE1 DATE,
> FLAG SMALLINT
> );
>
> The primary key is C and D. The selectivity of field D is poor, but
> it is not a foreign key of any table, only part of this table's
> composite key.
>
> Should I use a surrogate key for the table although it is not used
> as a foreign key?
>
> Am I reaching any limit I don't know?
>
> Any ideas/suggestions?
>
> Diego
>
>
>
>
>
> --- In firebird-support@yahoogroups.com, Helen Borrie <helebor@t...> wrote:
> > At 03:58 PM 24/08/2005 +0000, you wrote:
> > >So, what is the limit on the table size, and why is the import
> > >failing?
> >
> > Numbers as quoted - which is 2 BILLION, not 2 million. Sorry
> > about that red herring.
> >
> >
> > >I have just tried to split the file into two 17-million-row
> > >files. The first file was copied into the internal file without
> > >problems in 45 minutes, but the second one has failed (the
> > >process is still running after 3 hours)
> >
> > How do you know it has failed?
> >
> > Did you hard-commit the first batch before commencing the second?
> >
> > ./heLen