Subject Re: [ib-support] Adding index does nothing to select disctinct
Author Mark Patterson
Helen Borrie wrote:
> At 10:39 AM 13-09-02 +1000, you wrote:
>influence the speed very much. What is your WHERE clause like? If you add
>>
>>OK, what you're saying is that SELECT DISTINCT doesn't know how to take
>>advantage of indexes. Maybe that can be an enhancement for version 2.0.

> It's hard to see how it could, since SELECT DISTINCT has to get a distinct
> occurrence of all the columns named in the output set. The logic of that
> involves intermediate sets. Even in your case, where you have only one
> column, it still has to walk the whole table to eliminate the duplicates,
> because you have no limiting clause.

That depends how the index works. I don't know the mechanics of it, but I sort
of assumes there was some logical connection between indexes and sorting, and if
you have a sorted list with duplicates you don't need to walk the whole list.

> I avoid SELECT DISTINCT if speed matters, and structure the table and/or
> the query so that I can use a WHERE clause. In your example, I'd certainly
> have a JOB table where JOB_NUMBER is unique.

Sounds sensible, but not in harmony with the general idea of normalization. That
is looking like the way we'll go.

> But if you can't do that, at least limit the rows by some means (e.g. date)
> so that your query has a smaller set to walk.

Hmmm... yes, if the user enters the date range, and there is an index on date
range, before they can choose a job_number, the list could be made available
when they start entering the job_number. But if if still takes 8 seconds,
because they want a report over the last 2 years...

thanks again.

mark