Subject Data Model [ was: Well, here we go again]
Author Jim Starkey
unordained wrote:
> The discussion so far has been focused on your physical (scaling, lag) goals, but you also seem
> interested in some logical (model, datatype, interface) improvements over existing RDBMS's. Is that
> something you're also interested in discussing here, or would you (or other list members!) rather
> avoid turning this into a feature-request-fest? If you're going to start from scratch anyway ...
>
>
An issues that comes up with some frequency is the problem of
representing type hierarchies in a relational database. An example from
the Interbase days, for example, was a customer (with money to spend,
mind you) who needed to track trouble tickets for a complex network.
The problem was that there were many types of trouble -- some hardware,
some software, some human -- and although all trouble shared some common
attributes, different types of trouble had lots of different
attributes. Putting the tickets in different tables made it very
difficult to handle generic trouble, requiring all tickets to have all
attributes was a programming nightmare, and having a single table of all
tickets joined to tables containing trouble specific data was the union
of the worst characteristics of each.

There is a relative obscure beast call the semantic data model invented
by Dennis McLeod and Mike Hammer in 1976. The basic idea is that the
"create table" statement is extended to support a "extends <tablename>"
clause, creating a sub-table. Records stored in sub-tables
automatically appear in respective base table(s). But the really nice
thing about the semantic data model is simplicity: A single DDL clause,
no change or extensions to the DML, API, or protocols.

Here's a short example illustrating the principle that "shirts aren't
pants but both are clothes":

Cloud> create table clothes (sku int, description string, color string)

Cloud> create table shirts (collar_size int, sleeve int) extends clothes

Cloud> create table pants (waist int, inseam int) extends clothes

Cloud> insert into shirts (sku,description,color,collar_size,sleeve)
values (1, 'Dress Shirt', 'white', 16, 34)

Cloud> insert into pants (sku,description,color,waist,inseam)
values (2, 'Jeans', 'black', 40, 30)

Cloud> select * from clothes

SKU DESCRIPTION COLOR
---- ------------ ------

1 Dress Shirt white
2 Jeans black


Cloud> select * from pants

SKU DESCRIPTION COLOR WAIST INSEAM
---- ------------ ------ ------ -------

2 Jeans black 40 30


Cloud> select * from shirts

SKU DESCRIPTION COLOR COLLAR_SIZE SLEEVE
---- ------------ ------ ------------ -------

1 Dress Shirt white 16 34

Cloud>

Type hierarchies show up all over the place.

It was my intention to add semantic extensions to Rdb/ELN in 1983, but
it got lost in the compatibility brawl with Rdb/VMS (basically
sacrificed for blobs). I did a layered implementation for Interbase
that defined a single base table with a view for each sub-table. Cute,
not not quite enough.

Intelligently designed, the semantic extensions are so close to free in
terms of execution overhead that they slip into the category of
no-brainer. There are lots of ways to make the implementation difficult
and expensive (Postgres found some of them), but the implementation in
Nimbus (called Cloud above) took a couple of days with time off to eat,
drink, and sleep.

There are other parts of what is loosely called the object-relational
model that may also make sense, but none are as simple and powerful as
the semantic data model.

--
James A. Starkey
President, NimbusDB, Inc.
978 526-1376