Subject Re: [firebird-support] TimeStamp Index Question
Author Lee Jenkins
Helen Borrie wrote:

>>I have a query that will be ordering a TimeStamp field descending. Is it
>>generally a good idea to place a descending index on a field like that?
>
>
> A timestamp is a unique moment in time. So, for example, 2008-07-12
> 14:22:45.123 and 2008-07-12 14:22:45.124 are distinct nodes in the
> index. If each continues to live solo in its node then it is unique.
> Unique = highest possible selectivity.
>
> If by "inserted consistently" you mean that the timestamp is read from
> server time, not from client time, then the possibility of its being
> unique is high. "Non-uniqueness" really depends on when and how the
> timestamp is recorded.
>
> As an example, if you're posting and committing the sales transactions
> one by one and using a Before Insert trigger to write the timestamp,
> then the chance of a duplication is minute - another terminal writes a
> transaction within the same millisecond. On the other hand, if you're
> gathering up all the transactions once a month and writing
> CURRENT_TIMESTAMP onto that column on all of those records inside one
> execution of a stored procedure, they'll all have the same timestamp
> value. (In the latter example, selectivity would be potentially a little
> better if the SP wrote the timestamp from the value 'NOW'.)
>
> > 'WHERE ' +
> > ' SALE.DELIVERY_STATUS = :pDeliveryStatus ' +
> > 'ORDER BY ' +
> > ' SALE.SALE_DEPART_TIME DESC;';
>
> The SALE.DELIVERY_STATUS column will have extremely low
> selectivity....so the search is going to be slow intrinsically and any
> index on it will get in the way. In your example, you're not selecting
> on SALE.SALE_DEPART_TIME so its indexes aren't involved in the search.
> However, because you *need* the descending order, the DESC index could
> be useful if the output set is large. Test it with and without, with a
> "midstream" cache in both cases, and see whether it makes any
> difference. That's the bottom line, always.
>
> ./heLen
>

I did a little testing with relatively few records (300K) in the sales table.
It was quite slow and I believe I found the reason:

PLAN SORT (JOIN (POS_USER NATURAL, SALE INDEX (SALE_USER_OID), CUSTOMER INDEX
(PK_CUSTOMER)))

Although my test database has only a few user records defined that can be used
to enter sales, in practice, there still would not be that many user records (in
a real, in the field database) compared to the number of sale records there will
be. Selectivity in my test database was like 0.33333.., LOL. But it was a good
eye catcher since I don't believe that the few hundred records that would be in
the POS_USER table in a production db would help with selectivity anyway.

Instead, what I did was create an additional index:

CREATE INDEX SALE_SALEDATE_USEROID ON SALE (SYSTEM_DATE, USER_OID);

Then I added the system_date field to the WHERE clause and the query returned
instantly using the following plan:

PLAN SORT (JOIN (POS_USER NATURAL, SALE INDEX (SALE_USER_OID,
SALE_SALEDATE_USEROID), CUSTOMER INDEX (PK_CUSTOMER)))

I assume that since the POS_USER table would have at most, a few hundred records
in it, the natural sort is not that big of a deal. But the use of the
SALE_SALEDATE_USEROID index seems to have made the difference.

--
Warm Regards,

Lee