Subject | Re: [Firebird-Architect] FB 2.0 Road Map |
---|---|
Author | unordained |
Post date | 2004-09-09T18:13:30Z |
---------- Original Message -----------
[Reference-ish material for the debate concerning doing away with NULL]
To support Chris Date's view of missing information (in The Third Manifesto, with Hugh Darwen),
domains would have to be very user-extensible. You'd need to be able to say that your new
domain 'int_or_unknown' supported all integer values as well as a special value '?', or whatever
might be to your liking (and possibly several such values, to indicate which reason is being given
for a missing value). It would be preferable to also have named constants available for all these
extra values.
Date would be in favor of letting users define their operators as well, and has said it would be
nearly impossible to disallow users from recreating the '?' != '?' rule if they really wanted it.
He does say he would rather that a value always be equal to itself (NULL, '?', and anything else
included) for fairly obvious reasons. Perhaps he would be satisfied by two different equality
operators -- one which can't be overridden, and one which can? Without resolving this question, the
proposal merely shifts the NULL problem into the user's lap rather than the DBMS's -- but maybe
that's all they care about? Let the user shoot himself in the foot with NULL, or not, as he pleases?
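
To make the two-operator idea concrete, here is a rough sketch in Python rather than SQL. Everything in it is an assumption made up for illustration (the names Missing, IntOrUnknown, same_value and user_equals come neither from Date nor from Firebird): one equality is fixed and always treats a value as equal to itself, the other is the user-level comparison that recreates the '?' != '?' behaviour by answering "unknown".

```python
from enum import Enum
from typing import Optional, Union


class Missing(Enum):
    """Named constants for the extra, non-integer values of the domain."""
    UNKNOWN = "?"            # a value exists but is not known
    NOT_APPLICABLE = "n/a"   # no value could exist here


# Hypothetical domain: every integer plus the named special values above.
IntOrUnknown = Union[int, Missing]


def same_value(a: IntOrUnknown, b: IntOrUnknown) -> bool:
    """Fixed equality: every value, including '?', is equal to itself."""
    return type(a) is type(b) and a == b


def user_equals(a: IntOrUnknown, b: IntOrUnknown) -> Optional[bool]:
    """User-defined comparison that recreates the '?' != '?' rule.

    Returns None (standing in for 'unknown') when either side is missing.
    """
    if isinstance(a, Missing) or isinstance(b, Missing):
        return None
    return a == b
```

With these definitions same_value(Missing.UNKNOWN, Missing.UNKNOWN) is true, while user_equals on the same pair gives None, which is exactly the three-valued behaviour the user would have chosen to reintroduce.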
This also requires domains to be separated from representations; you'd need to evaluate the domain
of an expression by testing its constraint only, not by looking at whether it's internally an
integer or a varchar or a blob. As I understand it, Firebird currently assumes that constraints
just further reduce the number of allowed values in a domain, but domains are always constrained to
their underlying physical data layer (int, varchar, etc.). There are some neat features to be had
here, but also an extra cost at each data-validation point throughout the code.
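
A minimal sketch of a domain that is defined by its constraint alone, again in Python and with hypothetical names (Domain, contains, is_int are mine, not anything Firebird provides). The Domain class itself never asks how a value is stored; int_or_unknown accepts values with two different representations, and every membership test pays for running the constraint.

```python
from typing import Any, Callable


class Domain:
    """A domain defined only by a membership constraint (hypothetical API)."""

    def __init__(self, name: str, constraint: Callable[[Any], bool]):
        self.name = name
        self.constraint = constraint

    def contains(self, value: Any) -> bool:
        # The domain has no fixed underlying type; membership is whatever
        # the constraint says. This call is the per-validation cost.
        return bool(self.constraint(value))


def is_int(v: Any) -> bool:
    # Python's bool is a subclass of int, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


# Spans two representations: integers and the one-character string '?'.
int_or_unknown = Domain("int_or_unknown", lambda v: is_int(v) or v == "?")

# A conventional domain that only narrows a single representation.
small_int = Domain("small_int", lambda v: is_int(v) and -32768 <= v <= 32767)
```

Here int_or_unknown.contains(42) and int_or_unknown.contains("?") both hold, whereas small_int rejects "?" and 40000.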
Fabian Pascal has been more in favor of telling people to normalize their data to the point of
never having a NULL anywhere in the base tables, but hasn't been great at explaining what this does
to joins. Whereas Date and Darwen say that join operators should let you specify what value you
would like in place of missing information (such as the '?' value above -- basically requiring you
to use COALESCE everywhere), Pascal had something really awkward to say in his "Practical Issues in
Database Management" book, page 234: "Table operations would also have to be modified to yield
results with as many tables as there are types of propositions with only known values." (His
example join results in two tables, "with the DBMS aware of the relationship between them for the
purpose of further manipulation".) This contradicts Date's statement that all relational operations
should give you exactly one relation/table as a result; when asked, Pascal was quite unwilling to
explain how his statement could be reconciled with Date's.
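
For what the "COALESCE everywhere" style looks like in practice, here is a small sketch using Python's sqlite3 module. The employee and salary tables are invented for the example. The base tables hold no NULLs (a missing salary is simply an absent row), and the join substitutes '?' for the missing information. Note this is only an approximation of what Date and Darwen describe: the SQL outer join still produces a NULL internally, which COALESCE then replaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: no column in either base table allows NULL.
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE salary (employee_id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO salary VALUES (1, 50000);  -- Bob's salary is not recorded
""")

# One result table; missing salaries show up as the chosen value '?'.
rows = conn.execute("""
    SELECT e.name, COALESCE(s.amount, '?') AS amount
    FROM employee e
    LEFT JOIN salary s ON s.employee_id = e.id
    ORDER BY e.id
""").fetchall()

print(rows)  # [('Ann', 50000), ('Bob', '?')]
```

The result is a single table, in line with Date's one-relation-per-operation statement, at the price of having to name the substitute value in every such query.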
-Unordained
> Well, C. J. Date himself later changed his mind
>
> (http://www.amazon.com/exec/obidos/tg/detail/-/0201543036/ref=ase_ffesoftwareinc/002-1731702-2750422?v=glance&s=books);
> I am inclined to agree with his later opinions.
------- End of Original Message -------