Subject Re: [IB-Architect] Data clustering
Author Steve Landrum
This weekend I built a program that will move data from
a full database into an empty database made from the same
metadata. The move program used a table to identify the
table name move order(to accomidate dependent tables) and
an order statement.

The database I reordered is 177M the table with the most rows
is an owner table with 100,000 peoples names.

Data clustering does help query speed, in some cases up to 50%
in others not at all. The largest improvement was when the
search was on the clustered field and this was the first query
after the database connection was made. Subsequent queries were
faster though not as dramatic.

For one example
SELECT * FROM OWNERS WHERE LAST_NAME STARTING WITH 'SMITH'

Non Clustered:
Execute time: 900ms
Page Reads: 45
Records: 1490

Clustered:
Execute time: 625ms
Page Reads: 23
Records: 1490

Data clustering seems to work best where the clustered field is
the lone select field such as order details. When other fields
are added to the query the speed improvement is less impressive.

I am in the process of clustering a 2G database to test some
stored procedures on larger databases.

One question about the backup process. Lets say there is a table
with 10M rows and an unindexed field is chosen for clustering.
What happens to backup speed?
How much temp diskspace would be chewed up or if there was not
enough temp space on the disk would the backup abort?

My vote is for an extension to the system tables to force the
use of an index on the sorted fields and to not change the
backup restore process.

Steve