Subject | [firebird-support] Re: derived table within EXISTS (was: Different performance Superserver VS. Embeded ?) |
---|---|
Author | Svein Erling Tysvær |
Post date | 2008-02-25T12:08:52Z |
Hi Dany!
How complex things can become once the options increase. I don't use Fb 2.1 myself (I'm still on Fb 1.5.4), but I'm not shocked to hear that creating a derived table (I may get the terminology wrong) within an EXISTS can be very slow, I think the idea with allowing this was to facilitate SELECT ... FROM (SELECT...), rather than EXISTS(SELECT ... FROM (SELECT...)). Firebird has to choose between creating the derived table 18286 times, or to create a derived table that consists of all records having MOVI.TIPO = 'B'. And I do not know whether the optimizer has become smart enough to consider this last option a possibility.
If I understand your original query correctly, it ought to be rewritten to something like
select p.PROC, p.EXP1 as EXPE, p.ACTO, p.DEMA, p.OBSE, t.DSCR as D_TPRO
from PROC p
left join TPRO t on p.TPRO=t.TPRO
where exists(select * from MOVI m
where p.PROC = m.PROC
and m.TIPO = 'B')
order by lower(p.ACTO)
I'm sure this query executes almost instantly.
In the particular case you supply, I'd expect this transformation to be easy to do even if it is an automatically generated query, and I think your testing suggests that
... WHERE EXISTS((SELECT FieldValue FROM ...) X WHERE X.FieldValue ...)
ought to be avoided when possible (and I would expect it to be possible to rewrite all such queries).
The optimizer in Firebird has become continually better since InterBase was open sourced, but I know the optimizer not too rarely choose a strange (read: time consuming) plan with Firebird 1.5.4, and it wouldn't surprise me if there were still situations in Fb 2.1 where the optimizer didn't manage to simplify the SQL sufficiently (I almost consider your original query an 'error' with its unnecessary complexity, although it can be a useful test case - worth reporting to the development team?).
HTH,
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of danyschaer
Sent: 25. februar 2008 11:56
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: Different performance Superserver VS. Embeded ?
Hi Set and Helen,
Thank you for answering. The case is I have just one week testing
Firebird, and trying to make our large application compatible with
it. Our proyect is tested with Oracle (100%) and PostgreSQL (80%),
and now we started testing Firebird one week ago and we can see all
our tables are running very well using it.
Problems comes when we start trying our full database query system,
in where users can choice a lot of filters, alone or togheter, over
main an related tables. In that situation, I am sure we are not doing
exactly the best "select" for each situation, but we tested a very
big number of combinations with Oracle, an always he need no more
than 5 seconds for giving us the full response.
Doing just onew combination with Firebird 2.1, we found he needs 10
minutes the fires time, and 2 seconds if we repeat the same query
later. Thanks Helen, I noted my firs input comparing embeded vs.
superserver was wrong, just because I did my first test using
embeded, and then I used Superserver, inmediatly over the same
database.
I found the cache is alive even if I shut down the Firebird Server;
the only way I found to clear the previous query from cache is by
rebooting the machine. (BTW is there another way to clear it???).
I am verifing the performance running our query expresion with
FlameRobin. The first run need near 10 minutes; next tests runs
faster.
Next I will copy for you the report I get from FlameRobin. Table PROC
hast 18.286 records; table MOVI has 154093.
Starting transaction...
Preparing query:
select "PROC"."PROC","PROC"."EXP1" as "EXPE","PROC"."ACTO",
"PROC"."DEMA","PROC"."OBSE","TPRO"."DSCR" as "D_TPRO"
from "PROC"
left outer join "TPRO" on "PROC"."TPRO"="TPRO"."TPRO"
where ((exists (select "PROC" from
(select "MOVI"."PROC"
from "MOVI"
where "MOVI"."TIPO" = 'B') T
Where T."PROC" = "PROC"."PROC")))
order by lower("PROC"."ACTO")
Prepare time: 00:00:01.
Field #01: PROC.PROC Alias:PROC Type:STRING(10)
Field #02: PROC.EXP1 Alias:EXPE Type:STRING(18)
Field #03: PROC.ACTO Alias:ACTO Type:STRING(200)
Field #04: PROC.DEMA Alias:DEMA Type:STRING(200)
Field #05: PROC.OBSE Alias:OBSE Type:STRING(120)
Field #06: TPRO.DSCR Alias:D_TPRO Type:STRING(120)
PLAN (T MOVI INDEX (MOVI_PROC))
PLAN SORT (JOIN (PROC NATURAL, TPRO INDEX (TPRO_PK)))
Executing...
Done.
429914 fetches, 8 marks, 139758 reads, 8 writes.
0 inserts, 0 updates, 0 deletes, 151675 index, 33586 seq.
Delta memory: 49648 bytes.
Execute time: 00:09:39.
Please, let me know if you need more information. I can upload a copy
of the database if you want. (Empty: 600 kb)
Thank you very much.
Dany
How complex things can become once the options increase. I don't use Fb 2.1 myself (I'm still on Fb 1.5.4), but I'm not shocked to hear that creating a derived table (I may get the terminology wrong) within an EXISTS can be very slow, I think the idea with allowing this was to facilitate SELECT ... FROM (SELECT...), rather than EXISTS(SELECT ... FROM (SELECT...)). Firebird has to choose between creating the derived table 18286 times, or to create a derived table that consists of all records having MOVI.TIPO = 'B'. And I do not know whether the optimizer has become smart enough to consider this last option a possibility.
If I understand your original query correctly, it ought to be rewritten to something like
select p.PROC, p.EXP1 as EXPE, p.ACTO, p.DEMA, p.OBSE, t.DSCR as D_TPRO
from PROC p
left join TPRO t on p.TPRO=t.TPRO
where exists(select * from MOVI m
where p.PROC = m.PROC
and m.TIPO = 'B')
order by lower(p.ACTO)
I'm sure this query executes almost instantly.
In the particular case you supply, I'd expect this transformation to be easy to do even if it is an automatically generated query, and I think your testing suggests that
... WHERE EXISTS((SELECT FieldValue FROM ...) X WHERE X.FieldValue ...)
ought to be avoided when possible (and I would expect it to be possible to rewrite all such queries).
The optimizer in Firebird has become continually better since InterBase was open sourced, but I know the optimizer not too rarely choose a strange (read: time consuming) plan with Firebird 1.5.4, and it wouldn't surprise me if there were still situations in Fb 2.1 where the optimizer didn't manage to simplify the SQL sufficiently (I almost consider your original query an 'error' with its unnecessary complexity, although it can be a useful test case - worth reporting to the development team?).
HTH,
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of danyschaer
Sent: 25. februar 2008 11:56
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: Different performance Superserver VS. Embeded ?
Hi Set and Helen,
Thank you for answering. The case is I have just one week testing
Firebird, and trying to make our large application compatible with
it. Our proyect is tested with Oracle (100%) and PostgreSQL (80%),
and now we started testing Firebird one week ago and we can see all
our tables are running very well using it.
Problems comes when we start trying our full database query system,
in where users can choice a lot of filters, alone or togheter, over
main an related tables. In that situation, I am sure we are not doing
exactly the best "select" for each situation, but we tested a very
big number of combinations with Oracle, an always he need no more
than 5 seconds for giving us the full response.
Doing just onew combination with Firebird 2.1, we found he needs 10
minutes the fires time, and 2 seconds if we repeat the same query
later. Thanks Helen, I noted my firs input comparing embeded vs.
superserver was wrong, just because I did my first test using
embeded, and then I used Superserver, inmediatly over the same
database.
I found the cache is alive even if I shut down the Firebird Server;
the only way I found to clear the previous query from cache is by
rebooting the machine. (BTW is there another way to clear it???).
I am verifing the performance running our query expresion with
FlameRobin. The first run need near 10 minutes; next tests runs
faster.
Next I will copy for you the report I get from FlameRobin. Table PROC
hast 18.286 records; table MOVI has 154093.
Starting transaction...
Preparing query:
select "PROC"."PROC","PROC"."EXP1" as "EXPE","PROC"."ACTO",
"PROC"."DEMA","PROC"."OBSE","TPRO"."DSCR" as "D_TPRO"
from "PROC"
left outer join "TPRO" on "PROC"."TPRO"="TPRO"."TPRO"
where ((exists (select "PROC" from
(select "MOVI"."PROC"
from "MOVI"
where "MOVI"."TIPO" = 'B') T
Where T."PROC" = "PROC"."PROC")))
order by lower("PROC"."ACTO")
Prepare time: 00:00:01.
Field #01: PROC.PROC Alias:PROC Type:STRING(10)
Field #02: PROC.EXP1 Alias:EXPE Type:STRING(18)
Field #03: PROC.ACTO Alias:ACTO Type:STRING(200)
Field #04: PROC.DEMA Alias:DEMA Type:STRING(200)
Field #05: PROC.OBSE Alias:OBSE Type:STRING(120)
Field #06: TPRO.DSCR Alias:D_TPRO Type:STRING(120)
PLAN (T MOVI INDEX (MOVI_PROC))
PLAN SORT (JOIN (PROC NATURAL, TPRO INDEX (TPRO_PK)))
Executing...
Done.
429914 fetches, 8 marks, 139758 reads, 8 writes.
0 inserts, 0 updates, 0 deletes, 151675 index, 33586 seq.
Delta memory: 49648 bytes.
Execute time: 00:09:39.
Please, let me know if you need more information. I can upload a copy
of the database if you want. (Empty: 600 kb)
Thank you very much.
Dany