Subject | Performance diff between insert...select and for select ... do? |
---|---|
Author | Kjell Rilbe |
Post date | 2015-02-13T13:55:09Z |
Hi,
I'm writing a utility that will need to do two things for each record in
an external table and for this purpose I use a for select ... do
construct in an execute block. I do it this way because external tables
can't be indexed and I will scan the entire external table anyway.
The two operations are:
1. Update one existing record in the target table.
2. Insert new record in the same target table.
In "steady state" the target table will contain about 20 million records
and the external table will contain about 10 thousand records.
But the first time I run this, the target table will be empty and the
external table will contain about 18 million records. The update will
never find a record to update during this first execution.
Would I lose a lot of hours if I use the same execute block/for select
construct the first time? The alternative would be to do a regular
insert into target table select from externaltable the first time. My
guess is that the unecessary update will take increasingly long to
execute as the target table grows towards 18 milliion records, wasting a
lot of time for me, even if a suitable index is present.
Just to give a measure of the system's performance as it is, a simple
update on one column for 18 million records in the target table
currently seems to take about 6 hours. But I could drop a couple of
indices and perhaps I should bump up the page size from 4k to 8k or 16k.
Regards,
Kjell
--
Kjell Rilbe
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
DataDIA AB
Ulvsundavägen 106
168 67 Bromma
www.datadia.se <http://www.datadia.se>
08-514 905 90
Företagskontakt.se <http://xn--fretagskontakt-vpb.se> - urval av företag
och kontaktinformation
Personkontakt.se <http://personkontakt.se> - urval av hushållsadresser
I'm writing a utility that will need to do two things for each record in
an external table and for this purpose I use a for select ... do
construct in an execute block. I do it this way because external tables
can't be indexed and I will scan the entire external table anyway.
The two operations are:
1. Update one existing record in the target table.
2. Insert new record in the same target table.
In "steady state" the target table will contain about 20 million records
and the external table will contain about 10 thousand records.
But the first time I run this, the target table will be empty and the
external table will contain about 18 million records. The update will
never find a record to update during this first execution.
Would I lose a lot of hours if I use the same execute block/for select
construct the first time? The alternative would be to do a regular
insert into target table select from externaltable the first time. My
guess is that the unecessary update will take increasingly long to
execute as the target table grows towards 18 milliion records, wasting a
lot of time for me, even if a suitable index is present.
Just to give a measure of the system's performance as it is, a simple
update on one column for 18 million records in the target table
currently seems to take about 6 hours. But I could drop a couple of
indices and perhaps I should bump up the page size from 4k to 8k or 16k.
Regards,
Kjell
--
Kjell Rilbe
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
DataDIA AB
Ulvsundavägen 106
168 67 Bromma
www.datadia.se <http://www.datadia.se>
08-514 905 90
Företagskontakt.se <http://xn--fretagskontakt-vpb.se> - urval av företag
och kontaktinformation
Personkontakt.se <http://personkontakt.se> - urval av hushållsadresser