Subject Re: Mystery
Author Tom Conlon
Hi Helen,

Thanks for you response.

After running the query for the first time (1.5mins) If I stop and
restart the FB service the query is still very quick soo it seems an
OS cache comes into play here...?

> You should always try to parameterize any static queries. Prepare..

Not possible as it is a general user-defined query tool. One query to
the next may be quite different with unpredictable criteria.

> And, because this is all inner joins, I'd shift all search arguments
> into the search clause and test whether that improves the prepare time.

Tried it and no real difference.

Also tried upping DefaultDbCachePages to 8192 in firebird.conf but no
improvement either.

> Understand that the cache is used to store previously accessed pages
> and indexes. Do make sure you have a sufficiently large cache of
> physical RAM on hand to keep that many pages useful. A cache that is
> constantly being paged out to virtual memory because it's too small,
> or because physical RAM is insufficient, is not terribly useful.

2GB ram on my dev machine, 3.8GB ram on live server.
Similar performance seen:

Plan
PLAN JOIN (T1 INDEX (RDB$FOREIGN81),C INDEX (RDB$PRIMARY15),S INDEX
(IDXSITEPOSTCODE,RDB$PRIMARY13),CO INDEX (RDB$PRIMARY8))

Adapted Plan
PLAN JOIN (T1 INDEX (FK_CLNTSKILL1),C INDEX (PK_CLIENT),S INDEX
(IDXSITEPOSTCODE,PK_SITE),CO INDEX (PK_COMPANY))

------ Performance info ------
Prepare time = 16ms
Execute time = 1m 26s 125ms
Avg fetch time = 10,765.63 ms
Current memory = 2,706,972
Max memory = 2,756,372
Memory buffers = 8,192
Reads from disk to cache = 28,605
Writes from cache to disk = 0
Fetches from cache = 434,178

>>2. Can I fill the cache ahead of time but issuing a clntskills query
> JOIN clntskill t1 ON c.clntid=t1.clntid

Would issuing:
SELECT t1.clntid, cl.surname
FROM clntskill t1
JOIN client cl ON ON c.clntid=t1.clntid

Fill cache with data that could be used in subsequent 'JOIN clntskill
t1 ON c.clntid=t1.clntid' type queries?

> If these tables are subject to frequent maintenance and you're not
> doing anything currently to housekeep indexes, now might be a good
> time to start.

Ok - SET STATISTICS and/or ACTIVATE/DEACTIVATE INDEX xyz?


Tom

--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@...> wrote:
>
> At 04:30 AM 2/06/2007, you wrote:
> >Hi,
> >
> >SELECT c.clntid, c.surname, c.forename, c.title, c.siteid, c.email ,
> >co.compname
> >FROM Client c
> >JOIN Site s ON c.siteid=s.siteid
> >JOIN Company co ON s.compid=co.compid
> >JOIN clntskill t1 ON (c.clntid=t1.clntid) AND (t1.skillid=34 )
> >WHERE ( ( s.POSTCODE STARTING WITH "MK" ) )
> >
> >169 matches IN *139secs* :(
> >(561K skills, 84k clients, 31k sites, 26k companies)
> >
> >PLAN JOIN (T1 INDEX (FK_CLNTSKILL1),C INDEX (PK_CLIENT),S INDEX
> >(IDXSITEPOSTCODE,PK_SITE),CO INDEX (PK_COMPANY))
> >
> >This query appears to use all indices but first time it executes but
> >it takes much longer than it ought.
> >
> >Once it is in the cache it takes ~2s but that is not much good to the
> >end-user plus if you change the postcode to another one it takes more
> >time than expected.
> >
> >Two Questions:
> >1. Any ideas why it takes as long as it does?
>
> 561K skills, 84k clients, 31k sites, 26k companies - that's a pretty
> big catchment for a 4-way join, so there's a stronger than usual need
> to keep the execution environment optimal.
>
> Understand that the cache is used to store previously accessed pages
> and indexes. Do make sure you have a sufficiently large cache of
> physical RAM on hand to keep that many pages useful. A cache that is
> constantly being paged out to virtual memory because it's too small,
> or because physical RAM is insufficient, is not terribly useful.
>
> I don't know whether you made a typo in quoting your statement..but
> check your code to make sure that your text search arguments are in
> single quotes, not double quotes, just to be sure God is in His heaven.
>
> And, because this is all inner joins, I'd shift all search arguments
> into the search clause and test whether that improves the prepare time.
>
> >2. Can I fill the cache ahead of time but issuing a clntskills query
> >of some sort that would help with other clntskill requests?
>
> You should always try to parameterize any static queries. Prepare
> once, run many. Preparing the statement every time it runs is
horribly costly.
>
> SELECT c.clntid, c.surname, c.forename, c.title, c.siteid, c.email ,
> co.compname
> FROM Client c
> JOIN Site s ON c.siteid=s.siteid
> JOIN Company co ON s.compid=co.compid
> JOIN clntskill t1 ON c.clntid=t1.clntid
>
> WHERE s.POSTCODE STARTING WITH ?
> AND t1.skillid = ?
>
> Preparing the parameterized query in advance, , e.g. at application
> startup, would remove the need for it to be prepared the first time
> the user calls it. This isn't something you can do using your
> current static query, since you are forcing a prepare every time the
> statement runs.
>
> You could test whether the extra hit at startup time was an
> improvement on getting the hit the first time the application submits
> the query. The cache won't be populated by preparing, of
> course. Someone on the network is always going to take the "hit" for
> the empty cache.
>
> If these tables are subject to frequent maintenance and you're not
> doing anything currently to housekeep indexes, now might be a good
> time to start.
>
> ./heLen
>