Subject Re: [firebird-support] How to Speed up query containing Distinct
Author Helen Borrie
At 09:09 PM 23/09/2003 +0000, you wrote:
>Hi,
>
>Is there any way to speed up this query:
>
>SELECT DISTINCT kiosk_id
>FROM Metrics WHERE ExecTime IS NOT NULL AND
>Copies IS NOT NULL
>
>The Metrics table has about 500,000 records in it,
>and an index
>which includes the kiosk_id, ExecTime and Copies. I would have
>expected it to use this, but the plan that the Analyzer shows is a
>sort:

You have two indexes involving kiosk_id. Which one would the optimizer use?


>PLAN SORT ((METRICS NATURAL))
>
>The full definition of the table and indexes is below.
>
>Many thanks,
>
>David
>
>CREATE TABLE metrics (Kiosk_Id INTEGER, Form_Id INTEGER, Copies
>INTEGER, Pgs INTEGER, Script_Id INTEGER, ExecTime TIMESTAMP, Id
>INTEGER NOT NULL, Primary Key(Id));
>
>CREATE GENERATOR GEN_METRICS_ID;
>
>SET TERM ^ ;
>
>CREATE TRIGGER METRICS_BI FOR METRICS
>ACTIVE BEFORE INSERT POSITION 0
>AS
>BEGIN
>IF (NEW.ID IS NULL) THEN
> NEW.ID = GEN_ID(GEN_METRICS_ID, 1);
>END ^
>commit ^
>SET TERM ; ^
>
>
>CREATE INDEX m_kioskId_formId_exectime ON metrics (Kiosk_Id,
>Form_Id, ExecTime);
>
>CREATE INDEX m_kioskId_exectime_copies ON metrics (Kiosk_Id,
>ExecTime, Copies);
>
>CREATE INDEX m_formId_exectime_copies ON metrics (Form_Id, ExecTime,
>Copies);
>
>CREATE INDEX m_scriptId_exectime_copies ON metrics (Script_Id,
>ExecTime, Copies);

Kill all of these indexes and replace them with single-column indexes on
any columns which actually NEED to be indexed. If any of these columns is
a foreign key, don't index it at all, since it already has an index.

As far as Kiosk_ID is concerned, if its selectivity is low (few distinct
occurrences across the entire table) then don't give it a single-column
index. Natural order will be faster. Or, index (Kiosk_ID, ExecTime).

OTOH, if Kiosk_ID has higher selectivity (thin distribution across the
table) then use a parameterised WHERE clause, e.g. on ExecTime, to filter
the output set to a realistic size. A client can't do anything useful with
tens or hundreds of thousands of records.

Incidentally, given that (it seems) Kiosk_ID needs to be in every record
(at least for this query), I wonder why it is nullable.

heLen