Subject | Re: Firebird Usage Load Problem |
---|---|
Author | Adam |
Post date | 2005-07-14T23:36:02Z |
Has anyone mentioned the differences between FB and the other DBs he
is talking about? Unfortunately the RSS feed limits at 15 records so
I only caught the end of the discussion.
Maurice, are you aware that Firebird's indices are directional. That
means that using a normal ascending index for a
select max(id) from tableA
would actually slow the query down.
If you are porting code from one DBMS to another (as we did in the
past), you need to understand this difference or you will never
understand why performance is not there.
Other differences are that indices will include records that are not
visible to your transaction. Multiversion records allows other
transactions to delete, modify or insert a new record that you are
unable to see. Because of this, operations such as
select count(*) from TableA
can not simply count the index nodes, but must visit each record in
the table to check whether this transaction is able to see it or not.
With something like
select count(*) from TableA where ID < 50,
There are very clever ways of handling this, so ask if you need to
know how this works.
The index will be used, but each record where the ID is below 50
(even in records not visible to your transactions) would be visited
and counted if applicable.
Other things of note is that FB 1.5 sometimes does silly things
inside join operations. Because of the way the statistics make it
look, sometimes the query may choose a plan that we know is going to
be a bad choice. Sometimes moving conditions into the join clause can
really help, other times where left joins are logically equivalent,
left joins can be used to force the execution order.
You can force FB to use a specific plan, BUT keep in mind it often
does make a good choice, and the choice it makes is affected by the
current data shape. If you define a plan when you build your
application, then in some cases you force it to perform worse.
Other things such as the skip syntax still has to internally do its
stuff, it just doesn't return the rows, so therefore it may not
perform as you expect.
My recommendation as a first step is to identify the culprit queries.
Since you have a script, you could probably log the length each query
takes and analyse that. I use IBPlanalyzer to check ALL my queries to
make sure they do not unneccessarily hit records and that the plan it
chooses makes sense. For example, just last week I was able to reduce
a report from 10 minutes to run to 5 seconds now. It wasn't overly
hard, the poor thing had to do a table scan on a 50000 record table
and was joining about 9 other tables. All I did was to change the
join order forcing one of the tables to go first where the where
clause knocked out all but a feww hundred records.
Now, other culprits to consider.
Do you use triggers? If so, when do they fire? Do they recursively
fire themselves? This is something to be very careful of.
Obviously, the original settings you had were far to high for CS. My
guess is that you set them like that because you incorrectly
diagnosed the problem to be a starvation of cache (which was clearly
not the case). I believe the problem will be the choice of plan of
some of the queries, heck it may only be one or two. The solution
will probably be to either add an additional index to something or to
optimise some of your queries.
Hope that helps.
Adam
is talking about? Unfortunately the RSS feed limits at 15 records so
I only caught the end of the discussion.
Maurice, are you aware that Firebird's indices are directional. That
means that using a normal ascending index for a
select max(id) from tableA
would actually slow the query down.
If you are porting code from one DBMS to another (as we did in the
past), you need to understand this difference or you will never
understand why performance is not there.
Other differences are that indices will include records that are not
visible to your transaction. Multiversion records allows other
transactions to delete, modify or insert a new record that you are
unable to see. Because of this, operations such as
select count(*) from TableA
can not simply count the index nodes, but must visit each record in
the table to check whether this transaction is able to see it or not.
With something like
select count(*) from TableA where ID < 50,
There are very clever ways of handling this, so ask if you need to
know how this works.
The index will be used, but each record where the ID is below 50
(even in records not visible to your transactions) would be visited
and counted if applicable.
Other things of note is that FB 1.5 sometimes does silly things
inside join operations. Because of the way the statistics make it
look, sometimes the query may choose a plan that we know is going to
be a bad choice. Sometimes moving conditions into the join clause can
really help, other times where left joins are logically equivalent,
left joins can be used to force the execution order.
You can force FB to use a specific plan, BUT keep in mind it often
does make a good choice, and the choice it makes is affected by the
current data shape. If you define a plan when you build your
application, then in some cases you force it to perform worse.
Other things such as the skip syntax still has to internally do its
stuff, it just doesn't return the rows, so therefore it may not
perform as you expect.
My recommendation as a first step is to identify the culprit queries.
Since you have a script, you could probably log the length each query
takes and analyse that. I use IBPlanalyzer to check ALL my queries to
make sure they do not unneccessarily hit records and that the plan it
chooses makes sense. For example, just last week I was able to reduce
a report from 10 minutes to run to 5 seconds now. It wasn't overly
hard, the poor thing had to do a table scan on a 50000 record table
and was joining about 9 other tables. All I did was to change the
join order forcing one of the tables to go first where the where
clause knocked out all but a feww hundred records.
Now, other culprits to consider.
Do you use triggers? If so, when do they fire? Do they recursively
fire themselves? This is something to be very careful of.
Obviously, the original settings you had were far to high for CS. My
guess is that you set them like that because you incorrectly
diagnosed the problem to be a starvation of cache (which was clearly
not the case). I believe the problem will be the choice of plan of
some of the queries, heck it may only be one or two. The solution
will probably be to either add an additional index to something or to
optimise some of your queries.
Hope that helps.
Adam