firebird-support - Re: Slower fetchall after creating indices

Subject	Re: Slower fetchall after creating indices
Author	Adam
Post date	2006-07-03T12:11:29Z

--- In firebird-support@yahoogroups.com, "jasperelzinga" <jasper@...>
wrote:

>
> Hello,
>
> Im having a strange problem after i create indices.
>
> Normal situation:
> >>> timeExecute("SELECT Instantie.id FROM Instantie WHERE
> Instantie.deleted=0 ORDER BY Instantie.telefoon ASC", [])
> execute time: 0.56299996376
> >>> a = timeFetchall()
> fetchall time: 0.140999794006
>
> This selects 10.000 records, sorted on column 'telefoon' which has no
> index.
>
> Create an index on telefoon:
> >>> cur.execute("CREATE INDEX temp ON Instantie (telefoon)")
> >>> con.commit()
> >>>
>
> And now i execute the exact same query:
> >>> timeExecute("SELECT Instantie.id FROM Instantie WHERE
> Instantie.deleted=0 ORDER BY Instantie.telefoon ASC", [])
> execute time: 0.0
> >>> b = timeFetchall()
> fetchall time: 0.65700006485
>
> The execute time has been reduced a lot, that is what i'd expected.
> But.. the fetchall time is now more than half a second!
>
> Ok.. and to be sure, check if the results are te same:
> >>> a == b
> True
>
> Im really puzzeled by this situation. The fetchall method just returns
> the data (a list of tuples), which is exactly the same in both
> situations. I can't think of any reason why it could slow down.

In the first query, Firebird chooses to read the table in natural
order (in the order it is stored on disk). It then sorts the data in
memory, or using temporary files if it runs out of space in memory.

As you would expect, it takes a while to read and sort everything, but
once sorted it can return all the records quickly.

In the second case, you now provide Firebird with an index it can use
to do the sorting. Therefore no time is taken to do the sort. However
you are now reading from the table out of storage order. Because of
this, the read write head of the hard drive will jump around like
crazy, and fetching each record is more expensive.

It is a trade-off. Generally, if you are returning a lot of records,
an indexed sort is going to be slower. If you are returning only a few
records, the indexed sort is going to be significantly faster.

Change the query to

SELECT
FIRST 10
Instantie.id
FROM Instantie
WHERE Instantie.deleted=0
ORDER BY Instantie.telefoon ASC

Comparing the results, I would expect without the index it would take
over 500ms, but with the index would only be 1 or 2 ms.

> There
> are 10.000 records in the table and it has 55 columns, with 'id' as
> primary key with an index on it.
>
> What could be the reason and what could i do to fix this problem?

You have observed the difference in implementation between a natural
read and sort and an indexed read. What is the problem you are
referring to? if A==B then there is no problem, just a fast and a slow
method of coming up with the tuples.

Adam