Subject Re: [firebird-support] Optimizing SELECT DISTINCT ..
Author Ann W. Harrison
Harri Vartiainen wrote:
>
> What would be best way to optimize "SELECT DISTINCT FIELD1 FROM
> TABLE1 ORDER BY FIELD1" query?
>
> I have one table with about 200000 rows and 2000 distinct values,
> query takes on production system 2-5 seconds, according to the query
> plan the whole table is read. I have index defined on FIELD1, but it
> doesn't help in this case.
>
> I'm running FB 2.1.1

First, Firebird is a multi-version database with non-versioning indexes.
That means that FIELD1 of record A may have different values for
different transactions, and all values will be in the index. To
determine whether a value of FIELD1 is visible to your transaction,
Firebird has to read the record, not just the index.

So, a "select distinct FIELD1 from TABLE1 order by FIELD1" has
to read every record in the table. The fastest way to read every
record is to start with the first page in the table and read through
to the last page in the table.

"Ah", you say, but that requires a sort and reading is Kn while
sorting is K n log n. But the K involved in reading is vastly
larger than the K in a sort, so in almost all cases it's faster
to read records in their natural stored order and sort them rather
than rattle the disk to death picking out records in index order.

Your query may be faster if you drop the "order by" clause.


Good luck,

Ann