Subject Re: [ib-support] Key size too big
Author Helen Borrie
At 11:28 PM 8/06/2003 +0100, you wrote:
>Helen,
>
>did I misunderstand you (as is likely,) or did you suggest that the "rule"
>is that primary keys shouldn't contain any meaningful information?

Yes, this is one of the factors in the "A" piece of "ACID". Atomic.

>i kinda thought it was the other way around ... although our own project
>-does- use generators for primary key values, i've often been reminded
>that in theory, you really should be using some unique property of the
>data itself, in the form of a candidate key, when selecting a primary key ...

Hey, that's not wrong! Those theoretical rules are part of the process you
go through as you normalize your data - you must determine which column(s)
is/are the primary key, i.e. will represent the non-key columns as a unique
set.

But, if you end up with meaningful data in your keys, then your next job is
to look at how you are going to use these structures. If users enter the
data for keys, if the data for keys might change, if the key comes from an
external source, yada, yada, then you need to surrogate the key with a
non-meaningful value that can't be touched.

That way, for a simple example, if Surname + FirstName are your primary
keys in a Person table, you surrogate the key with a generator. Then, it
doesn't matter if Doe + Jane changes to Smith + Jane when Jane gets married.

>the theory also continues, saying that an index, under any form, should
>not be required to enforce uniqueness properties: you should be able to
>define a primary key on any field, regardless of how your index works,

The theory says that any structure of columns in a row which, taken in
isolation, uniquely distinguishes the rest of the data, is a candidate
key. While the theory deals with the patterns which occur in groups of
related data and helps you to recognise candidate keys, it doesn't concern
itself with physical implementation. Theoretically, for example, you
could form a primary key consisting of every column in the table and leave
your implementation to enforce it by reading every column in every row each
time a new row is inserted or any column in the row receives a change.

Some older RDBMS (pre-SQL) rely completely on indexes to form data
relationships. Typically, these materialise as hierarchies of indexes that
are repeated down the line of dependency. If you've ever worked with
Paradox in its native (non-SQL) engine environment, you will be familiar
with this style of implementation.

At the physical level, i.e. beyond theory, the database engine accepts a
CONSTRAINT, to which is applies a set of rules. Once of the rules
associated with a PRIMARY KEY constraint is that it requires an index to
enforce it. Such index itself has certain constraints - UNIQUE and NOT
NULL - so the PRIMARY KEY constraint cannot be applied to the column[s] if
it is not possible to create a unique index on it. As the PRIMARY KEY
constraint is implemented in Firebird, if the constraints for the *index*
are satisfied, then the engine permits the constraint (PRIMARY KEY) and
also creates the INDEX to enforce the constraint.

>as the PK is a logical thing, and the index is a physical thing
>(increasing access speeds.)

Well, it's true that an index can be used to increase access speeds,
although this is always the case. There are conditions where one would
avoid creating a FOREIGN KEY constraint, because the enforcing non-unique
index has such low selectivity that it degrades performance. In other
words, the use of an index to enforce the rules of a constraint is not
about performance, whereas indexing, in other contexts, is.

>the optimizer should, if at all possible, be invisible on the logical level.

Can't see what this has to do with primary key constraints.


>but then, i've been told firebird plans to remain an SQL database, not a
>relational one ...

Good grief!!! A relational database is one which recognises the logical
bindings between sets by means of keys. Firebird has always been a
relational database.

SQL is a query language which surfaces a number of operations on related
sets in a (supposedly) standard way. It doesn't presuppose anything about
what goes on in the database engine to implement the language specification.

Although always a RDBMS, Firebird actually didn't start life as "an SQL
database". Lurking there in Firebird is one of its original query engines,
gdml. Under the hood, it still uses pieces of gdml to build and maintain
metadata.

heLen