Subject Scanning Entire Table
Author Dan Wilson
In firebird-devel recently, there has been a discussion about the relative efficiency of using an index versus reading all records in physical order and then scanning them. The statement was made that when reading the entire table it is always more efficient to read the table in physical record order, then sort it than it is to use an index, thus reading records in random-order. This discussion was very relevant to me because I am in the process of writing a program that will have to scan through two very large tables, looking at each record in a particular order. To do so, I have just created a new index that expresses the particular sort order I need.

My question is this: is the above statement about relative efficiency true regardless of table size? In my particular case, one of the tables to be scanned has over 220 million records, and the second table has about 100 million records. I had assumed that in such a case, attempting an on-the-fly sort would be very very expensive and thus creating the index would be the better way to go. Am I wrong?

Thanks,

Dan.