Subject Re: [firebird-support] SELECT statement runs at different speeds
Author unordained
---------- Original Message -----------
From: "red_october2009" <kevin.wendy.morris@...>
> I have the following statement:
>
> SELECT
> f.PAR_GUID,
> MAX(DATEDIFF(DAY, f.EVT_DT + f.EXP_RSP_DYS, CURRENT_DATE))
> FROM
> JET_LTR f
> WHERE
> (f.EXP_RSP_DYS > -1) AND
> (f.RSP_RCVD_DT IS NULL) AND
> (DATEDIFF(DAY, f.EVT_DT + f.EXP_RSP_DYS, CURRENT_DATE) >
> f.EXP_RSP_DYS)
> GROUP BY 1
>
> The DB is inactive all night. Each morning, when I run this for the
> first time that day, it will take 40 seconds to complete. However,
> without any changes to the DB, if I run it subsequent times the same
> day, the subsequent times it will complete in less than half a second.
>
> What can I do, to make this statement run in sub-second time, always,
> including the first time it's run? Also, would using a computed
> field help the speed at all (for the MAX() function).
------- End of Original Message -------

a) It could be the cache. Do you restart FB during the night?

b) It could be sweeping the data: do you do bulk-loads during the night? Are
there long-running transactions during the day that would prevent sweeping from
occuring, and they all get committed / rolled-back at night (when everyone
disconnects)? First thing in the morning might be the first time FB gets a
chance to clean up old record versions.

c) Do you have any other jobs running first thing in the morning, that could be
impacting performance right then?

d) Can we see the current PLAN? I assume PAR_GUID is already indexed? Do you
have any multi-field indices, that could help with those AND's? (Might not make
sense, depending on selectivity.)

e) If you were to have
ALTER TABLE jet_ltr ADD response_by COMPUTED BY (evt_dt + exp_rsp_dys);
CREATE DESC INDEX jet_ltr_expected ON jet_ltr (response_by);

You could use the following in your query, which should be able to use the
index:

... WHERE ... AND (jet_ltr_expected < current_date) AND (rsp_rcvd_dt IS NULL)

Potentially, you could short-circuit it further with this:

... late_if_after COMPUTED BY (CASE WHEN rsp_rcvd_dt IS NULL THEN evt_dt +
exp_rsp_dys ELSE NULL END);

... desc index ... (late_if_after);

... WHERE late_if_after < CURRENT_DATE /* checks lateness & no-response all in
one operation */

And it would maybe help? It depends on how much stuff is usually beyond the
expected-response-date, as to which side is most expensive: the grouping, or
the filtering. I would assume that things usually flow smoothly, and it's the
filtering that's expensive, with a tiny sort/group (for the max) at the end.

-Philip