Subject Re: [firebird-support] indexing a status column
Author Arno Brinkman

> Background: It has been established that indexing a boolean (yes/no,
> 1/0) column is not recommended in Firebird for performance reasons (i.e.
> update/insert speed, confusing the optimizer).

> My situation: I have a large table with a 'status' column. This table
> has an INTEGER primary key fed from a generator. The 'status' column is
> also an integer field, but is limited to five values ([1..5]) at the
> moment. The distribution of these five values are not even, with the
> value '1' expected in more than 60% of the rows. The rest of the values
> are divided more or less evenly.

> At some point in the near future i'll need to write some
> queries/storedproc that will only process records of a certain 'status'
> value. Without an index, this operation will normally require a full
> table scan. This makes me a bit nervous.
> Will it be a good idea to either :
> a. create an index on the 'status' column ?

I think in your situation yes, but how much/often are you going to filter on
the whole table for '1' (60% value) ?

> b. create a compound index on 'status' + 'pk' columns ? (supposedly this
> will remove the 'long index chain' caused by [a] above. In a cursory
> test with a small dataset, "select * from table where status=1" does
> indeed use the index.). Is this the best thing to do?

If you filter most on ('status' and PK) or ('status' where the value is a
little range of the whole table).

When only creating an index on 'status' and you've a query like this :
SELECT * FROM TableA WHERE PK # 1 and Status = 1
# is (=, >=, <=, >, <)
Both the PK index and the 'status' index will be used, which will result in
many index-page reads.

With a index on (status, PK) only 1 index will scanned and limited by both

> c. split the table into five different tables with identical structure,
> and remove the 'status' column altogether ? (I don't like the idea at
> all, but someone suggested this, so i'll ask for second opinions).

If you go for speed this is probably the fastest one, but i don't like it
and try to avoid it (maintenance).

In 1 system i've split a table into two tables (active and history) to keep
the active table fast. The orders come into the active table and the
'status' values in it are changed "rapidly", but for these records there's
always a fixed point when they go to the history table.

Arno Brinkman

Firebird open source database (based on IB-OE) with many SQL-99 features :

Support list for Interbase and Firebird users :

Nederlandse firebird nieuwsgroep :