Subject Index Selectivity Question (RFC)
Author Joseph Alba
I would just like to request for comments on my article below so that I
will know if my views are correct or not. This is with regards to
whether we will just continue to blindly follow the AVOID INDEX
SELECTIVITY mantra, or actually harness the extreme power of Interbase
sparse indexes which is actually optimized for INDEX SELECTIVITY - which
is the reverse of AVOID SELECTIVITY.

I also invite others to confirm my findings. So, here goes:

----------------

In the Interbase community, newbies are often advised to avoid index
selectivity - which means, do not index on keys which have too many same
values.

Index selectivity is bad because it could mean extremely slow user data
entry and worse, never ending RESTORE operations.

But I remember, during a discussion with Jim Starkey regarding indexes
around one or two years ago, that selectivity is not THE cause of the
problem per se.

I did not understand the explanation then, but I kept on with my good
practice of avoiding index selectivity by making my index keys as unique
as possible.

----

With the release of Oracle 8, Oracle annouced a new and advanced type of
index which they called bitmapped. This is in addition to the usual old
style dense B-Tree indexes that the older versions had, and which most
of the other databases use.

You can create this type of index by adding a new BITMAP keyword to the
usual CREATE INDEX command.

Their description of the Bitmap Index characteristics was very much like
the Interbase index characteristics.

But they took care to admonish would be users of this type of index that
this is NOT SUITABLE for TRANSACTION tables which are used to accept
user data entries (UPDATE, DELETE, INSERT data manipulation operations).

This type of index is suitable for ARCHIVE or analysis tables which
accepts BATCH inputs and are heavily used for SELECT data access
operations.

When doing any INSERT, UPDATE, or DELETE operations, this sequence
should be followed:

1. Turn Indexing off (INACTIVE)
2. Do the batch INSERT, UPDATE, DELETE operations
3. Turn Indexing on (ACTIVE)

---

Today, I decided to bite the bullet and apply this advise to my
Interbase database.

I have an archive table of 2,500,000 records with a BILL_TYPE column
with only three possible values, ('C','I', and 'B'), and an AREACODE
table with only fifteen possible values plus possibly NULL.

This archive table BILL accepts batch inputs from transaction table
NEWBILL.

The following commands below took under 4 minutes each to finish.

CREATE INDEX BILL_BILLTYPE ON BILL(BILL_TYPE)
CREATE INDEX BILL_AREACODE ON BILL(BILL_AREACODE)

So, no proplems associated with Index selectivity occurred.

When I did a SELECT:

SELECT * FROM BILL WHERE BILL_TYPE='I' AND AREACODE='01'

It was able to show a plan of (BILL_TYPE,BILL_AREACODE).

The response time was also extremely fast, returning the 5 row answer
(in the 2.5 million row table) in less than a second.

CONCLUSION:

Index selectivity is not bad if you turn off all indexes with
selectivity characteristic before doing any INSERT, UPDATE, and DELETE
operation, and turning it ON again.

Index selectivity is VERY GOOD for SELECT operations.

RECOMMENDATION:

Interbase needs a new type of index is suitable for the USUAL on-line
user data entry operations of INSERT, DELETE, and UPDATE.

This new type of index will actually be the old B-TREE type which is non
bit-map or in technical terms, DENSE.

It is ironic that what Oracle boasts as a new type of BITMAP (SPARSE)
index, Interbase had for years already.

But the ordinary dense type index that all the usual run-of-the-mill
database have, Interbase lacks.

This handicap forces Interbase developers to use multi-key indexes which
cannot be joined and creates problems for DATA ENTRY and RESTORE
operations to the unwary and uninitiated.

It would be good for Interbase to have dense type B-TREE indexes
available for Transaction / data entry tables.


BEST PRACTICE:

#1 For large sized transaction tables where most operations are INSERT,
DELETE,and UPDATE, avoid index selectivity by using multi-key indexes
with low row cardinality. (AVOID INDEX SELECTIVITY mantra)

#2. For large sized archive table where most operations are SELECT, use
SINGLE KEY indexes so that Interbase can combine these indexes for
optimal performance.

Do only batch updates on these tables. But before doing batch updates,
(or restore), take care to ALTER INDEX to INACTIVE. After doing the
batch updates, that's the time to do ALTER INDEX to ACTIVE.

#3. For small sized transaction tables (under 100,000 rows) there
appears to be no selectivity issue, so, it is better to use single key
indexes.

NOTE:

To those who saw this post in the Interbase list, Paul Beach requested
me to re-post this article on IB-Architect list. I just hope that BBW
would not whop off my head this time around.