Subject Re: [firebird-support] Performance problem
Author Richard Wesley
Alexandre -

Thanks for the response.

On 31 Oct 2007, at 11:26, Alexandre Benson Smith wrote:

> Richard Wesley wrote:
>> Hi All -
>>
>> I have a scientific data set (Jim Hansen's global temperature anomaly
>> grid) which I am trying to do a plot with. When I run a query to
>> generate the plot data from an embedded Firebird 2.0.3 database, it
>> takes about 25 minutes:
>>
...
>> The query plan is:
>>
>> PLAN SORT ((LandOceanTemps NATURAL))
>>
>
> I would not have expected anything different, since you have no WHERE
> clause.

I was just being complete. I expected a table scan.

>> The same query against a MySQL database containing the original data
>> runs in about 47 seconds:
>>
>
> MySQL is a completely different engine, and it even differs between the
> storage engines used; some of them don't care about transaction
> isolation and the like, so there is a lot less overhead to handle.

Yes, but I was surprised at the magnitude of the difference (50x).

>> The Firebird schema is
>>
>> CREATE TABLE "LandOceanTemps" (
>> "Anomaly" DOUBLE PRECISION,
>> "Date" DATE,
>> "Lat (deg)" INTEGER,
>> "Long (deg)" INTEGER
>> );
>>
>> CREATE ASC INDEX "_tidx_128_2a" ON "LandOceanTemps" ("Date");
>> CREATE ASC INDEX "_tidx_128_3a" ON "LandOceanTemps" ("Lat (deg)");
>> CREATE ASC INDEX "_tidx_128_4a" ON "LandOceanTemps" ("Long (deg)");
>>
>
> I don't expect these indices to be of any help here, since all the data
> must be read (there is no WHERE clause); it is usually faster to read it
> all in storage order and sort it in memory.

Since there are only about 350 or so result rows, this seems reasonable.
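For reference, here is a hypothetical sketch of the kind of aggregate under
discussion; the actual query is elided above, so the choice of grouping
column is an assumption based on the schema:

  SELECT "Date", AVG("Anomaly") AS "MeanAnomaly"
  FROM "LandOceanTemps"
  GROUP BY "Date"
  ORDER BY "Date";

Any query of this shape has to read every row once, so the plan
SORT ((LandOceanTemps NATURAL)) quoted above is exactly what you would
expect to see.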

>> The table has 10,886,400 rows. The dates are all month precision;
>> lat/long are on a 2 degree grid and the Anomaly values range from
>> -11.7 to 14.09.
>>
>> Any ideas on why this is 50x slower? It is just numbers!
>>
>
> Nope, but if I had to think of something to try to increase the
> performance, I would increase the sort memory space in firebird.conf,

The number of result rows is relatively small, and I would expect computing
averages into such a small result set to be IO bound (the memory needed
should just be the grouping values plus a sum and a count per group). Or is
there something else going on here that more memory would help with?
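For anyone following along, a minimal sketch of the sort-memory settings in
firebird.conf, assuming the parameter names used by Firebird 2.0 (later
versions renamed these); the values shown are illustrative, not
recommendations:

  # firebird.conf (Firebird 2.0): size of each sort memory block, in bytes
  SortMemBlockSize = 1048576
  # upper limit on memory used for in-memory sorting before spilling to disk
  SortMemUpperLimit = 134217728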

> and be sure there is no garbage left on the database.

The data was inserted once and only selects are being performed, so I
suspect this is not an issue.
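For completeness, a sketch of how one might check for and remove garbage
with the standard Firebird command-line tools; the database path and
credentials below are placeholders:

  gstat -h /path/to/temps.fdb
      (a large gap between "Oldest transaction" and "Next transaction"
       suggests uncollected garbage)
  gfix -sweep -user SYSDBA -password masterkey /path/to/temps.fdb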

________________________________________________________
Richard Wesley
Senior Software Developer
Tableau Software
Visit: http://www.trytableau.com/now.html