Subject Re: [firebird-support] Best table structure for Debit/Credit
Author Svein Erling Tysvaer
Matthias Hanft wrote:
> Svein Erling Tysvaer wrote:
>
>> Well, I only know that I don't get any such error with a similar
>> query and Firebird 1.5.3.
>
> If I use the query alone, there is no problem. It's just the
> use as a subquery...

My test was also with a subquery, that is

select a.a,
sum(
case
when a.a < 10 and a.b < 100
then 1
when a.a < 100 and a.b < 100 and
not exists(select * from b where b.c = a.c and b.a <> a.a)
then 2
else 3
end)
from a
where a.a < 100
group by 1

and it executes without problems.

> BTW, I remembered that I had already built a view which includes
> only payments (vonlypay), and another view which includes only
> the latest payment of each customer (vlastpay). Using that latter
> view in the "big" query with
>
> sum(
> case
> when (b.transtype = 'R') /* reminder */
> and
> (b.transdate > (select transdate from vlastpay v
> where company=2 and v.custno = b.custno))
> then 1
> else 0
> end
> ) as current_reminders
>
> gives the following extraordinary result (look at the "execute time"
> and the "fetches from cache"):
>
> Plan:
> PLAN SORT ((V VONLYPAY BILLING INDEX
>(RDB$FOREIGN16,RDB$FOREIGN16,RDB$FOREIGN16,RDB$FOREIGN16,RDB$FOREIGN16,RDB$FOREIGN16,RDB$FOREIGN16)))
> PLAN SORT (JOIN (B INDEX (RDB$FOREIGN15),T INDEX (RDB$PRIMARY1)))
> Adapted plan:
> PLAN SORT ((V VONLYPAY BILLING INDEX
>(INTEG_54,INTEG_54,INTEG_54,INTEG_54,INTEG_54,INTEG_54,INTEG_54)))
>PLAN SORT (JOIN (B INDEX (INTEG_47),T INDEX (INTEG_2)))
> ------ Performance info ------
> Prepare time = 31ms
> Execute time = 6h 28m 51s 703ms
> Avg fetch time = 686.226,56 ms
> Fetches from cache = -311.326.039
>
> Funny, eh? The results aren't very correct either :-)

Well, you managed to bring Firebird to its knees, though I don't yet
trust you managed to get Firebird produce incorrect results (that is a
big no-no and I believe such an error that would receive high priority
amongst the coders if it could be reproduced in a 'simple' test case).
Are you sure you selected the transdate for the last payment and not
only the transdate for *a* payment?

As for execution time and the plan, the indexes used for subselects are
far more important than the indexes for the first table(s) in the main
select. Add to that the likely problem of low selectivity for
RDB$FOREIGN16 and you get a plan that would make me reconsider the
select without even trying to run it (unless I knew the subselect would
only be executed a handful of times). Congratulations on having patience
to wait 6 hours.

>> Yes, I see, sorry. It is not all customers that are shown, but I
forgot that the same custno may exist in several companies. Change to
>>
>> and EXISTS(select * from billing b3
>> where b3.custno = b.custno
>> and b3.company = b.company
>> and b3.cleared+0 = 0)
>
> Yes, now it's the correct result. FYI:
>
> Plan:
> PLAN (B3 INDEX (ICOMPANYCUSTNOTRANSTYPE))
> PLAN SORT (JOIN (B INDEX (RDB$FOREIGN15),T INDEX (RDB$PRIMARY1)))
> Adapted plan:
> PLAN (B3 INDEX (ICOMPANYCUSTNOTRANSTYPE)) PLAN SORT (JOIN (B INDEX
(INTEG_47),T INDEX (INTEG_2)))
> ------ Performance info ------
> Prepare time = 32ms
> Execute time = 31s 78ms
> Avg fetch time = 914,06 ms
>
> Seems so. Meanwhile, I got the feeling that subSELECTs should be
> avoided whenever possible (for timing reasons), at least when used
> within aggregate functions (SUM) - I guess they're just executed
> way too often for good timing results...

Often subselects are just what you need, but if they're executed too
often and not against a unique index, I guess you're right. Your use of
HAVING at least hammered my EXISTS in this case.

Set