Subject Re: Performance decrease after running SET INDEX STATISTICS
Author accdev3234
--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@...>
wrote:
>
> At 11:54 AM 28/03/2009, you wrote:
> >We have a strange scenario at one of our sites where one query slows
down dramatically after we call SET INDEX STATISTICS.
> >Unfortunately this is used a lot so it causing performance issues
throughout the system.
> >
> >The basic query is:
> >
> > SELECT DISTINCT BR.BASE_SN, R.TITLE
> > FROM RES R
> > JOIN BASERES BR ON BR.BASE_SN = R.BASE_SN
> > WHERE R.STATUS=1 AND BR.STATUS=1 AND
> > BR.BASE_SN IN (<insert list of primary keys>)
>
> Try this:
> SELECT DISTINCT BR.BASE_SN, R.TITLE
> FROM RES R
> JOIN BASERES BR ON BR.BASE_SN = R.BASE_SN
> AND BR.BASE_SN IN (<insert list of primary keys>)
> AND BR.STATUS=1
>
> WHERE R.STATUS=1
>
> and give the plans for both versions.
>

SQL> SET PLANONLY;

SQL> SELECT DISTINCT BR.BASE_SN, R.TITLE
CON> FROM RES R
CON> JOIN BASERES BR ON BR.BASE_SN = R.BASE_SN AND
CON> BR.BASE_SN IN
(8,71,99,130,310,356,426,529,537,689,19736,19792,19917,19929,20089,20094\
,20108,20110,20115,20191) AND BR.STATUS=1
CON> WHERE
CON> R.STATUS=1;

PLAN JOIN (R NATURAL,BR INDEX
(PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BA\
SERES,PK_BASERES,PK_BASERES,PK_BASERES,
PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BAS\
ERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES))

SQL>
SQL> SELECT DISTINCT BR.BASE_SN, R.TITLE
CON> FROM RES R
CON> JOIN BASERES BR ON BR.BASE_SN = R.BASE_SN
CON> WHERE
CON> R.STATUS=1 AND BR.STATUS=1 AND
CON> BR.BASE_SN IN
(8,71,99,130,310,356,426,529,537,689,19736,19792,19917,19929,20089,20094\
,20108,20110,20115,20191);

PLAN SORT (JOIN (R NATURAL,BR INDEX
(PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BA\
SERES,PK_BASERES,PK_BASERES,PK_BASERES,
PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BAS\
ERES,PK_BASERES,PK_BASERES,PK_BASERES,PK_BASERES)))

> >
> >Both the BASERES and RES tables have about 25000 records. There is a
FK constraint R.BASE_SN -> BR.BASE_SN and BR.BASE_SN is the primary key
of the BASERES table.
> >The status columns are not indexed as there are only 2 distinct
values. However, I did try adding an index to see if that makes any
difference and it doesn't.
>
> No, it would make things worse in v.1.5.x.
>
>
> >For all our other sites this is not a problem and this site runs ok
after exporting and importing the data. Then, if we update the index
statistics then the performance drops.
> >For them, the query take 90secs but after export/import, the query
takes about 6secs. This returns just 334 records.
> >
> >If we change the WHERE to R.BASE_SN IN (<insert list of primary
keys>) then the query is again fast. Unfortunately, for all other DBs
tested from other sites,
> >the performance is worse when we change the alias.
> >
> >Using the clients database, it is easy to replicate.
> >
> >1. Load IBExpert and run query (fast)
> >2. Recalculate index statistics in IBExpert
> >3. Run query again - now slow.
>
> It's more than likely something to do with the caching. Test whether
running the same query a second time is just as slow.
>

Running the queries for a second time still gives the same timing. This
is using both IBExpert and isql.

> For your own edification, you could get some clues about the
usefulness (or not) of various approaches if you studied the plans used.
You might get more direct info about what the *optimizer* is doing by
using isql for those tests. AFAIK, IBExpert fiddles about with plans
according to its own rules and could well defeat better choices made by
the optimizer.
>
>
> >Basically fbserver gets locked at 100% CPU and noone else can use the
database.
> >This is using FB SS 1.5.5 on Windows and Linux.
> >
> >Any thoughts as how we can resolve this issue?
>
> For one thing, I don't think doing SET STATISTICS on the fly is
necessarily doing you any favours since it necessarily blows away any
benefits you have accumulated from the cache.
>
> For another, SELECT DISTINCT is a performance killer (esp. in v.1.5
and below) because it can't use an index at sort time. You might like
to try out GROUP BY for the sort instead and see if it improves things,
viz.
>
> Try this:
> SELECT BR.BASE_SN, R.TITLE
> FROM RES R
> JOIN BASERES BR ON BR.BASE_SN = R.BASE_SN
> AND BR.BASE_SN IN (<insert list of primary keys>)
> AND BR.STATUS=1
> WHERE R.STATUS=1
> GROUP BY 1,2
>

No difference.

> And, finally, for performance testing I'd stay away from elaborate
tools that do fancy things with your query specs. Get familiar with
isql for this kind of thing: you probably shun it because it is so
"bare-bones", but when "bare-bones" is what you really need, it's worth
the effort !
>
> ./heLen
>

Don't mind the "bare-bones" approach, just currently more familiar with
IBExpert. Will now get to to know isql more.

If I change the alias in the where clause (change BR.BASE_SN to
R.BASE_SN):

SQL> SELECT DISTINCT BR.BASE_SN, R.TITLE
CON> FROM RES R
CON> JOIN BASERES BR ON R.BASE_SN = BR.BASE_SN
CON> WHERE
CON> R.STATUS = 1 AND BR.STATUS = 1 AND
CON> R.BASE_SN IN
(8,71,99,130,310,356,426,529,537,689,19736,19792,19917,19929,20089,20094\
,20108,20110,20115,20191);

PLAN SORT (JOIN (R INDEX
(FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE\
_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,
FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_\
REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,
FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_\
REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,
FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_\
REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,
FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO,FK_RESOURCE_\
REF_2064_BASERESO,FK_RESOURCE_REF_2064_BASERESO),
BR INDEX (PK_BASERES)))

This is fast for this database but slow for others I tested against.
Maybe it is time to revisit the full query against the other databases
and see if they can be improved.

Thanks.