Subject Re: [firebird-support] Performance problem with grouping/aggregate functions
Author Svein Erling Tysvaer
Hi Phil!

Why not simply get rid of the 'subaggregation'?

SELECT SUM(AV2.VALUEOFASSET)
FROM ASSETVALUES AV2
WHERE NOT EXISTS(
SELECT * FROM WRITEOFFS WO
JOIN ASSETS A ON A.ASSET_SN = WO.ASSET_SN
JOIN ASSETVALUES AV ON AV.ASSET_SN = A.ASSET_SN
WHERE AV.ASSET_SN = AV2.ASSET_SN
AND AV.ASSETVALUE_SN > AV2.ASSETVALUE_SN)
AND EXISTS(
SELECT * FROM WRITEOFFS WO
JOIN ASSETS A ON A.ASSET_SN = WO.ASSET_SN
WHERE AV2.ASSET_SN = A.ASSET_SN)

Generally, the optimizer tries to transform all IN (<subselect>) into
EXISTS(<subselect>). IN (SELECT MAX...) may be a situation where it
isn't yet able to do so, and then your entire subselect is done once for
every record in ASSETVALUES - needless to say, an expensive and
unneccessary operation. I believe the above query to be equivalent of
what you wrote, but it should be much, much faster.

Maybe you could speed up even further by taking some of the joined
tables out of the subselects and into the main select, but I do not know
whether ASSET_SN is unique in ASSETS and/or WRITEOFFS or if there are
records in ASSETVALUES that doesn't exist in ASSET/WRITEOFFS.

HTH,
Set

phil_hhn wrote:
> Hi, I am having trouble with query performance - maybe someone can
> suggest a better way? (Using SS 1.5.3)
>
> We have a table of ASSETS, which get depreciated every year, so for
> each year the new value of each asset is added to an ASSETVALUES
> table. If we write off an asset we add a record to a WRITEOFFS table
> (the record has a fk to the ASSET pk). To list what the value of each
> asset was when it was written off, we simply run:
>
> SELECT MAX(AV.ASSETVALUE_SN) FROM WRITEOFFS WO
> JOIN ASSETS A ON A.ASSET_SN = WO.ASSET_SN
> JOIN ASSETVALUES AV ON AV.ASSET_SN = A.ASSET_SN
> GROUP BY AV.ASSET_SN
>
> This works fine and is sub-second (it also assumes the max(pk) will
> always be the recent asset value - maybe conceptually dodgy (if the
> pks didn't increase sequentially), but in practice this is fine ;-) ).
> Next we want to sum all the assetvalues so we encase the select in the
> following:
>
> SELECT SUM(AV2.VALUEOFASSET) FROM ASSETVALUES AV2 WHERE
> AV2.ASSETVALUE_SN IN (<select query above>)
>
> Performance goes out the door. There are only 10 pk's returned by the
> first select, but to run the entire thing takes 80 seconds (and that
> is on a fast server, with only an average-sized database). If I take
> the 10 pk's and manually put them into the 'IN' clause, the query is
> again instant.
>
> Is there a way to force the optimiser/parser or something to work more
> efficiently?
> And/or is there a better way to run this combination of grouping and
> aggregate functions?
>
> Thanks,
> Phil