Subject | Re: [firebird-support] Optimization Question |
---|---|
Author | Jakub Hegenbart |
Post date | 2004-02-19T16:36Z |
> The Table:And what about this one:
> CREATE TABLE SPREADACTIONTABLE
> (
> SPREADACTION_ID BIGINT NOT NULL,
> SPREAD_ID BIGINT NOT NULL,
> ACTION_TYPE SMALLINT NOT NULL,
> STATUS SMALLINT,
> SPREAD_TIME INTEGER NOT NULL,
> ACTION_TIME INTEGER NOT NULL,
> CREATION_TIMESTAMP TIME_STAMP ,
> CONSTRAINT SPREADACTIONTABLEPRIMARYKEY1 PRIMARY KEY (SPREADACTION_ID)
> );
>
> CREATE ASC INDEX SPREADID_INDEX ON SPREADACTIONTABLE
> (SPREAD_ID, ACTION_TIME, SPREADACTION_ID);
>
>
> The select statement:
>
> select first 1 SpreadAction_ID from SpreadActionTable
> where Spread_ID = 9007 AND ( Status IS NULL OR Status != 2 )
> AND Action_Time >= 1073030400
> AND Action_Time <= 1075535999
> Order By Action_Time;
>
> The plan used for this was: PLAN SORT ((SPREADACTIONTABLE INDEX (SPREADID_INDEX)))
>
> Why does the execution plan include a sort? Since it has an ascending index on field ACTION_TIME, and I have requested "first 1", should it not simply read the index until it finds the first ACTION_TIME within the specified range, grab the corresponding table row, and be done?
>
> Sometimes this query executes instantly, but many other times it takes several seconds to do: those are apparently the times when there are lots (thousands) of rows with the specified SPREAD_ID. Apparently Firebird is retrieving all the rows, sorting them, and only then delivering the first row to me.
select first 1 SpreadAction_ID from SpreadActionTable
where Spread_ID = 9007 AND ( Status IS NULL OR Status != 2 )
AND Action_Time >= 1073030400
AND Action_Time <= 1075535999
Order By Action_Time
PLAN (SPREADACTIONTABLE ORDERED SPREADID_INDEX);
In theory, Firebird could perform an index scan and directly filter out messages with SPREAD_ID=9007 AND ACTION_TIME>=1073030400 AND ACTION_TIME<=1075535999, since you have an compound index which starts with these two columns. After that, it should start fetching rows from the data pages and stop on the first row that satisfies all the conditions. (I'm just guessing how i would perform it on the paper if i was a slow reader (disk access :))
If there's an INDEX (SPREADID_INDEX), it fetches all data that satisfies the (SPREAD_ID=9007 AND ACTION_TIME>=1073030400 AND ACTION_TIME<=1075535999) condition, but when reading the data, the row that you're interested in is less likely to be fetched as soon as possible.
Imagine there was no "FIRST 1". Then, if there was no ORDER BY in your query, then fetching any number of rows would be faster with INDEX than with ORDERED, since INDEX creates the appropriate sparse bitmap and then goes "natural", but only for some data pages. And the same goes _even for ordered result set_ (!), because AFAIK it's usually faster to sort "in memory."
But your case is different, you want the first row of an ordered result set where the ORDER BY field can utilize the same compound index that is used for filtering against SPREAD_ID and it should be IMHO faster to use (SPREADACTIONTABLE ORDERED SPREADID_INDEX) in your query.
I might be terribly wrong (in that case, i'll be glad to learn something new), the benefits are most probably also dependent on the real numbers/statistics. Just try ORDERED (as you probably figured out yourself, there's a slight notion of it in your post) :)
Hope this helps...
Jakub Hegenbart