Subject Performance on big tables
Author rabt@dim.uchile.cl
Hi everybody.

I'm migrating from MySql to Firebird 1.5.1 (SS) on win2k (P4, 2Ghz, 646Mb RAM).
I've been reading all Firebird documentation and now I'm very excited about all
the features shown there. Being a newbie on this, so far I've managed to dump
all my data in a comfortable way using perl and external tables, and then
re-inserting it into the final tables. Right now I have 12 "almost" identical
monthly tables of 8000000 rows containing (mostly) numerical data in 99 columns
each, all indexed in a common field "id". I'm trying to do some statistics
reports with them. What I need is to retrieve the annual sum of a certain
column "value" using a multiple full outer join and a group by. I'm using
coalesce and iNvl to deal with nulls.

My query looks like this:

select
coalesce(a1.id,a2.id,...,a12.id) as id,
sum(iNvl(a1.value,0)+iNvl(a2.value,0)+...+iNvl(a12.value)) as value
from table1 a1
full outer join table2 a2 on a1.id = a2.id
full outer join table3 a3 on a1.id = a3.id
...
full outer join table12 a12 on a1.id = a12.id
group by coalesce(a1.id,a2.id,...,a12.id)

I tried this on a small set of tables and it works like a charm, but on the
bigger ones it takes forever. The query eats all CPU power (like 99%) so I have
to shutdown the server to kill the query (I've waited more than 3 hours!!).

I have not touched firebird.conf, so I guess it's using it's defaults. I do not
make any inserts, deletes or updates (yet), just selects. I've created the
table just a week ago and I'm working alone, locally. Excepting "id" which is
INTEGER NOT NULL, almost all columns are defined as NUMERIC(18,0) NOT NULL.

I've tried disabling sweep interval, and changing buffers with gfix to improve
performance, and nothing. By the way, memory usage seems to stay low all the
time during the query.

What should I check to improve performance?.

Thanks in advance,

Rodrigo Abt B.
Statistical advisor, SII,
Chile.

P.S.: The data I'm using contains a not so small amount (about 2500) of
duplicates on the index field in each table, which I should not remove.


----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.