Subject | Re: [firebird-support] Why does DISTINCT does not use the index in this simple case? |
---|---|
Author | Ann Harrison |
Post date | 2012-05-16T15:52:21Z |
On Wed, May 16, 2012 at 5:05 AM, Marc Bettex <bettex@...> wrote:
record, or that it takes 0ms to get the whole result set. There is a
difference. An index will always be the fastest way to get the first row,
but may be measurably slower than a table scan plus a sort when you're
returning a large number of record. Index scans tend to jump from page to
page, reading a record here and a record there, while a table scan reads
each page only once.
the indexed access faster because it does not leap from page to page
randomly. It also makes the sort slower - the Firebird sort algorithm is
not very good when data arrives sorted.
at the data because a single row may have several index entries,
representing values from different versions of the record. But Firebird
can certainly use an index for grouping, provided it also checks that the
record versions themselves are appropriate for the running transaction.
problem of your real query.
WHERE [...]
ORDER BY [...] ROWS x TO y.
and one to put the rows in order so you can pick a subset wit the ROWS x TO
y. When using an index for navigation, Firebird can use only one
index(see below). So it must choose between using all the indexes that can
help reduces the intermediate product based on the WHERE clause, or walking
a single index to match the ORDER BY clause. It will never be able to walk
an index that matches the DISTINCT clause. Maybe with a complete
description of the query and the table and index definitions, someone will
be able to suggest a better way to express the query.
Good luck,
Ann
Normally, Firebird uses indexes to select data rather than to order it.
When operating in that mode, it creates a bitmap of record numbers that
match in one index, then if there is another index that can be used, it
creates a bitmap of those record numbers, and so on for all indexes that
can provide significant subsetting of the data. Firebird then combines the
bitmaps and retrieves the record in record numbers order.
[Non-text portions of this message have been removed]
>When you say "complete" do you mean that ti takes 0ms to get the first
> I'm using Firebird super server 2.5.1 on windows 7. I've got a database
> with a single table with one column filled with 250000 rows. This single
> column is the primary key, which implies that it is indexed and unique and
> contains ids from 000001 to 250000.
>
> The statement "SELECT id FROM table" takes around 0ms to complete.
>
record, or that it takes 0ms to get the whole result set. There is a
difference. An index will always be the fastest way to get the first row,
but may be measurably slower than a table scan plus a sort when you're
returning a large number of record. Index scans tend to jump from page to
page, reading a record here and a record there, while a table scan reads
each page only once.
> The statement "SELECT id FROM table GROUP BY id" also takes around 0ms toAgain, do you mean the time to the first record or the time to the last?
> complete. The plan for this query uses the index of the primary key.
>
> The statement "SELECT DISTINCT id FROM table" takes around 500ms toDo you store your records in id order? You probably do, and that makes
> complete. The plan for this query does not use the index of the primary key.
>
the indexed access faster because it does not leap from page to page
randomly. It also makes the sort slower - the Firebird sort algorithm is
not very good when data arrives sorted.
>Probably not a factor here.
> I've seen this post
> http://tech.groups.yahoo.com/group/firebird-support/message/100051 which
> is quite similar, but I did not really understand the answers. Someone said
> to make the database in the 3rd normal form but I don't really see how
> database normalization is related to DISTINCT.
> Two other people said that rows might have different values for differentThey were confused. You can't use an index to count rows without looking
> transactions, and thus the index cannot be used.
at the data because a single row may have several index entries,
representing values from different versions of the record. But Firebird
can certainly use an index for grouping, provided it also checks that the
record versions themselves are appropriate for the running transaction.
> But if that's the reason, then how could the index be valid for the GROUPMaybe you can force a plan, but solving the simple case will not solve the
> BY clause, or any other part of any other SQL query?
>
> If I'm mistaken regarding the answers of the other post, could someone
> give me an explanation? Does anybody have an idea of why Firebird does not
> use the index in my simple query and if it is possible to make Firebird use
> it?
>
problem of your real query.
> I know that the last request is stupid, but my rather complex request thatFROM table1, table2, table3, [...]
> I'm trying to optimize also takes around 500ms on 250000 rows of data and
> is something like
> SELECT DISTINCT id, col1, col2, col3, [...]
WHERE [...]
ORDER BY [...] ROWS x TO y.
> If I remove the DISTINCT from my query, the execution time drops to 1ms.You've got two separate sorts in that query, one to find the distinct rows,
> So it seems to me that the bottleneck is the DISTINCT and if I can't
> optimize this simple query, I don't see how it would be possible to
> optimize my more complex query.
>
and one to put the rows in order so you can pick a subset wit the ROWS x TO
y. When using an index for navigation, Firebird can use only one
index(see below). So it must choose between using all the indexes that can
help reduces the intermediate product based on the WHERE clause, or walking
a single index to match the ORDER BY clause. It will never be able to walk
an index that matches the DISTINCT clause. Maybe with a complete
description of the query and the table and index definitions, someone will
be able to suggest a better way to express the query.
Good luck,
Ann
Normally, Firebird uses indexes to select data rather than to order it.
When operating in that mode, it creates a bitmap of record numbers that
match in one index, then if there is another index that can be used, it
creates a bitmap of those record numbers, and so on for all indexes that
can provide significant subsetting of the data. Firebird then combines the
bitmaps and retrieves the record in record numbers order.
[Non-text portions of this message have been removed]