Subject RE: [ib-support] Summarization Query Speed Issue
Author sugi
Dear all,

Many thanks to mr Daniel Rail, mr Arno Brinkman, and mr Svein E Tysvaer
for taking the time to respond to my inquiries. I'm summarizing the
suggestions given and my results below.

> Is Product.ProductID unique ?
Yes. ProductID is the primary key, integer field.

> If yes how does this perform :
> SELECT
> p.productID,
> (SELECT SUM(o.qty) FROM OrderDetail o
> WHERE o.productid = p.productid)
> FROM Product p
> ORDER BY p.productID
PLAN (O INDEX (FK_ORDERDET_REFERENCE_ITEMS)) PLAN (P NATURAL)
From my limited testing, the performance is quite good, around 2
seconds. Now this baffles me since I was told once that 'correlated
subquery' could not be faster than a straightforward join...

I don't know anything about how the internal optimizer works, so this
might be asking too much, but is it possible to further improve the
optimizer so that the execution time of the two queries do not differ
too much? Um... but maybe this is a subject better suited for
firebird-devel list instead of ib-support.

> select p.productID, sum(o.qty)
> from product p
> left join orderdetail o on (p.productID = o.productID)
> join ordermaster m on (o.ordermasterid = m.ordermasterid and
> m.transdate between '2003-01-01' and '2003-02-01')
> group by p.productID
PLAN SORT (JOIN (JOIN (P NATURAL,
O INDEX (FK_ORDERDET_REFERENCE_ITEMS)),
M INDEX (PK_ORDERMASTER)))
Execution time is around 50 seconds. (As opposed to ~60 and ~51 seconds
for the previous version). An improvement, but not much.

Based on mr Arno Brinkman's suggestion, here's the 'correlated subquery'
version that I came up with:
...
SELECT
p.productID,
(SELECT SUM(o.qty)
FROM OrderDetail o join ordermaster m
on (o.ordermasterid = m.ordermasterid)
and (m.orderdate between '2003-01-01' and '2003-01-02')
WHERE o.productid = p.productid)
FROM
Product p
...
PLAN JOIN (O INDEX (FK_ORDERDET_REFERENCE_ITEMS),M INDEX
(PK_ORDERMASTER)) PLAN (P NATURAL)
Execution time is almost instant (around 1 seconds), the fetching is a
different story altogether...:)

Since the performance difference is quite big, does this mean that I
have to rewrite all my LEFT JOIN queries?

Thank you all for your inputs and suggestions.
sugi.