Subject | Performance problem |
---|---|
Author | Richard Wesley |
Post date | 2007-10-31T17:18:23Z |
Hi All -
I have a scientific data set (Jim Hansen's global temperature anomaly
grid) which I am trying to do a plot with. When I run a query to
generate the plot data from an embedded Firebird 2.0.3 database, it
takes about 25 minutes:
SELECT ((CASE WHEN (F_INTEGERABS("LandOceanTemps"."Lat (deg)") >
CAST(66.5 AS DOUBLE PRECISION)) THEN _UNICODE_FSS 'Polar'
WHEN (F_INTEGERABS("LandOceanTemps"."Lat (deg)") <
CAST(23.5 AS DOUBLE PRECISION)) THEN _UNICODE_FSS 'Tropical'
ELSE _UNICODE_FSS 'Temperate' END)) AS
"none:Climate Zone:nk",
((CASE WHEN ("LandOceanTemps"."Lat (deg)" >= 0) THEN
_UNICODE_FSS 'Northern' ELSE _UNICODE_FSS 'Southern' END)) AS
"none:Hemisphere:nk",
(EXTRACT(YEAR FROM "LandOceanTemps"."Date")) AS "yr:Date:qk",
(AVG(CAST("LandOceanTemps"."Anomaly" AS DOUBLE PRECISION))) AS
"avg:Anomaly:qk"
FROM "LandOceanTemps"
GROUP BY 1,
2,
3
The query plan is.
PLAN SORT ((LandOceanTemps NATURAL))
The same query against a MySQL database containing the original data
runs in about 47 seconds:
SELECT ((CASE WHEN (ABS(`LandOceanTemps`.`Lat (deg)`) > 66.5)
THEN 'Polar'
WHEN (ABS(`LandOceanTemps`.`Lat
(deg)`) < 23.5) THEN 'Tropical'
ELSE 'Temperate' END)) AS
`none:Climate Zone:nk`,
((CASE WHEN (`LandOceanTemps`.`Lat (deg)` >= 0) THEN
'Northern' ELSE 'Southern' END)) AS `none:Hemisphere:nk`,
(YEAR(`LandOceanTemps`.`Date`)) AS `yr:Date:qk`,
(AVG(`LandOceanTemps`.`Anomaly`)) AS `avg:Anomaly:qk`
FROM `LandOceanTemps`
GROUP BY 1,
2,
3
The Firebird schema is
CREATE TABLE "LandOceanTemps" (
"Anomaly" DOUBLE PRECISION,
"Date" DATE,
"Lat (deg)" INTEGER,
"Long (deg)" INTEGER
);
CREATE INDEX "_tidx_128_2a" ON "LandOceanTemps"."Date" ASC;
CREATE INDEX "_tidx_128_3a" ON "LandOceanTemps"."Lat (deg)" ASC;
CREATE INDEX "_tidx_128_4a" ON "LandOceanTemps"."Long (deg)" ASC;
The table has 10,886,400 rows. The dates are all month precision;
lat/long are on a 2 degree grid and the Anomaly values range from
-11.7 to 14.09.
Any ideas on why this is 50x slower? It is just numbers!
TIA,
________________________________________________________
Richard Wesley Senior Software Developer Tableau
Software
Visit: http://www.trytableau.com/now.html
I have a scientific data set (Jim Hansen's global temperature anomaly
grid) which I am trying to do a plot with. When I run a query to
generate the plot data from an embedded Firebird 2.0.3 database, it
takes about 25 minutes:
SELECT ((CASE WHEN (F_INTEGERABS("LandOceanTemps"."Lat (deg)") >
CAST(66.5 AS DOUBLE PRECISION)) THEN _UNICODE_FSS 'Polar'
WHEN (F_INTEGERABS("LandOceanTemps"."Lat (deg)") <
CAST(23.5 AS DOUBLE PRECISION)) THEN _UNICODE_FSS 'Tropical'
ELSE _UNICODE_FSS 'Temperate' END)) AS
"none:Climate Zone:nk",
((CASE WHEN ("LandOceanTemps"."Lat (deg)" >= 0) THEN
_UNICODE_FSS 'Northern' ELSE _UNICODE_FSS 'Southern' END)) AS
"none:Hemisphere:nk",
(EXTRACT(YEAR FROM "LandOceanTemps"."Date")) AS "yr:Date:qk",
(AVG(CAST("LandOceanTemps"."Anomaly" AS DOUBLE PRECISION))) AS
"avg:Anomaly:qk"
FROM "LandOceanTemps"
GROUP BY 1,
2,
3
The query plan is.
PLAN SORT ((LandOceanTemps NATURAL))
The same query against a MySQL database containing the original data
runs in about 47 seconds:
SELECT ((CASE WHEN (ABS(`LandOceanTemps`.`Lat (deg)`) > 66.5)
THEN 'Polar'
WHEN (ABS(`LandOceanTemps`.`Lat
(deg)`) < 23.5) THEN 'Tropical'
ELSE 'Temperate' END)) AS
`none:Climate Zone:nk`,
((CASE WHEN (`LandOceanTemps`.`Lat (deg)` >= 0) THEN
'Northern' ELSE 'Southern' END)) AS `none:Hemisphere:nk`,
(YEAR(`LandOceanTemps`.`Date`)) AS `yr:Date:qk`,
(AVG(`LandOceanTemps`.`Anomaly`)) AS `avg:Anomaly:qk`
FROM `LandOceanTemps`
GROUP BY 1,
2,
3
The Firebird schema is
CREATE TABLE "LandOceanTemps" (
"Anomaly" DOUBLE PRECISION,
"Date" DATE,
"Lat (deg)" INTEGER,
"Long (deg)" INTEGER
);
CREATE INDEX "_tidx_128_2a" ON "LandOceanTemps"."Date" ASC;
CREATE INDEX "_tidx_128_3a" ON "LandOceanTemps"."Lat (deg)" ASC;
CREATE INDEX "_tidx_128_4a" ON "LandOceanTemps"."Long (deg)" ASC;
The table has 10,886,400 rows. The dates are all month precision;
lat/long are on a 2 degree grid and the Anomaly values range from
-11.7 to 14.09.
Any ideas on why this is 50x slower? It is just numbers!
TIA,
________________________________________________________
Richard Wesley Senior Software Developer Tableau
Software
Visit: http://www.trytableau.com/now.html