Subject Re: [Firebird-admin] (Fwd) RFC: Tutorial D Implementation
Author Leandro Dutra
Should I subscribe to your firebird-admin and ib-architect lists by
myself or will you forward this to them?


> This is indeed an interesting proposal. I think you will have more
> response from the firebird team than the Interbase (Borland) team, and
> indeed you seem to have contacted us.

One thing I still couldn't gather from the website (I haven't
inspected your mailing list archives) is whether firebird is indeed a fork
of Interbase and, if so, how your objectives and future versions will
differ.



> Before I discuss any technical issues, note that firebird is currently
> under the Interbase public license, specified by Borland,

This can be a thorny issue indeed. As far as I understand, we would
have to either check for MPL-LGPL compatibility (which I'd rather avoid
because LGPL will not protect the users' freedoms as strongly), or get
firebird to use Mozilla's dual licensing, if IPL was based on MPL 1.1. I
believe dual licensing would be the best bet, but then we would need
Interbase collaboration (see http://www.mozilla.org/MPL/MPL-1.1.html, item
13, and
http://www.gnu.org/philosophy/license-list.html#GPLIncompatibleLicenses).
But I still couldn't find a contact at Interbase.

Unfortunately I believe the use of GPL-incompatible licensing is a
much more serious question than is usually realized. But let's at least talk
and see what we can do about that.



> It has been a considerable time since I studied Tutorial D,
> and I never
> fully understood it. As I recall, some of the main features were:

I do not claim to have fully understood it either, but as far as I
understand it is a saner language than SQL will ever be able to be. I would
recommend that anyone interested read at least The Third Manifesto
(http://dbdebunk.com/books.htm#cjd9b) and Chris J. Date's other books
(http://dbdebunk.com/books.htm). The Third Manifesto is essential because
it defines both Tutorial D and the reasons behind it, as well as many
implications. There is an early, very succinct version of a part of it at
http://www.acm.org/sigmod/record/issues/9503/manifesto.ps.

Even without fully grasping it, I still firmly believe we have no
other way of advancing the fundamentals of the RDBMS field. This is not to
say that SQL can't be improved from the current situation, whether through
better implementations or through dialects more compliant with the
latest standards. But IMNSHO this should be done in parallel with, not instead
of, the advancement of the data language itself, and this I believe can't be
done while preserving SQL compatibility at the data language level. So our
best way could be a single back end engine supporting both SQL *and*
Tutorial D as separate front end languages.

That said, let us try to tackle your questions.

> --arbitrary types as domains, corresponding to OO types.

Quite so. I would put it a little differently: it defines its
attributes on domains, and each domain must be defined as a type, which can
be user-defined (is this what you call "arbitrary"?).

As for correspondence to OO types, The Third Manifesto has its own
type inheritance model... the idea is to support the needs of OO programming
without bending the relational model. But I won't be able to explain
everything, you should really get Date's writings instead of relying on my
email explanations.



> --a pure relational language, with relations built of these
> types and all
> relational operations available. (In particular, joins on arbitrary
> domains).

Yes, it is purely relational, but on this relational basis it builds
many additional features orthogonal to the relational model in order to
support OO and other needs.

And while yes, it supports joins on arbitrary attributes, its
emphasis is on natural joins.
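
To make the natural join point concrete, here is a minimal sketch in
Python, not in Tutorial D syntax. It assumes a relation is represented as a
list of dicts with identical keys; the function name and the sample
supplier/shipment data are purely illustrative.

```python
def natural_join(r, s):
    """Join two relations on every attribute name they have in common.

    A relation is modelled here as a list of dicts sharing the same keys.
    With no common attributes the result degenerates to a Cartesian product.
    """
    r, s = list(r), list(s)
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attribute names present in both headings
    result = []
    for t in r:
        for u in s:
            if all(t[a] == u[a] for a in common):
                merged = {**t, **u}
                if merged not in result:  # a relation holds no duplicate tuples
                    result.append(merged)
    return result


suppliers = [{"sno": "S1", "city": "London"}, {"sno": "S2", "city": "Paris"}]
shipments = [{"sno": "S1", "pno": "P1"}, {"sno": "S1", "pno": "P2"}]
joined = natural_join(suppliers, shipments)
```

No join condition is written out: the shared attribute name (here "sno")
is the condition, which is why naming attributes consistently matters so
much in this style.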



> --What is the relationship to the outside world?

I don't think I understood what you mean...



> --Are the domain members constrained to be implemented in a particular
> language?

Tutorial D strives to be computationally complete, so we should be
able to define domains in Tutorial D. But I think that, if need be, it should
be possible, after implementing Tutorial D itself, to add support for the
definition of operators and such in different languages. More specifically
the Third Manifesto

It's been a long time since I've studied OO, and now I am quite
unfamiliar with OO jargon. What exactly do you mean by domain members?



> --How are generalized theta joins implemented (eg. where a <
> b generalizes
> to where a.getX() = b.getX(). )

I will have to investigate this one.


> --Are type hierarchies supported?

The Third Manifesto has its own inheritance model, which I won't be
able to explain here. You would really have to buy the book.



> Some architectural advantages of firebird are:
>
> --Already implements domain concept, although not for stored
> procedures.

I don't see how this goes... once you have proper domains, it is a
data definition issue, how can stored procedures not support them?

Perhaps you have in mind the SQL definition of domain. AFAIU, a SQL
domain is just a named SQL simple type definition, not a domain as defined
in theory.



> --Good Blob support ( I imagine most complex domains would be stored
> similarly to blobs, at least if they are of unbounded size)

Not really... the BLOB could be useful when the user wants a quick
and dirty generic attribute, but mostly we should strive to make user
definition of types good enough to be used in most cases.

As for big scalar values, the definition of physical characteristics
of storage should be kept outside of the Tutorial D environment, being
addressed by other means. Thus if you already have a working solution for
the problem it is a good thing. By the way, have you seen the solution
given to the same problem by PostgreSQL 7.1?



> --Firebird has a history of supporting multiple relational
> languages. The
> original language, GDML, was a more pure and powerful
> relational language
> than SQL. Supported external languages are mapped or
> compiled to internal
> BLR (binary language representation), which is then executed.

I think this is the feature which will be most useful. Is a
definition of GDML, or of BLR itself, available?

Do you think BLR can be extended enough for Tutorial D support
without breaking SQL support? If this is not possible we would need a fork.



> the future. At the moment it is rather difficult to add new
> data types:
> apparently modifications must be made in multiple files, and
> testing is
> difficult.

Yes, this may be thorny indeed. Keep in mind that in Tutorial D we
would need not only to add data types, but also to create a generic mechanism
for user definition of new types and their operators.
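
As a rough illustration of what I mean by a generic mechanism, here is a
minimal Python sketch. It is not Tutorial D and not anything firebird
offers; the class ScalarType, its methods and the WEIGHT example are all
hypothetical names made up for this message. The only assumptions are that
a type is a named set of values given by a constraint, and that every type
gets an equality operator by default.

```python
class ScalarType:
    """Hypothetical user-defined scalar type: a named constraint plus operators."""

    def __init__(self, name, constraint):
        self.name = name
        self.constraint = constraint
        # Every type starts out with equality; other operators are added later.
        self.operators = {"=": lambda a, b: a == b}

    def select(self, value):
        """Return value if it belongs to the type, otherwise raise ValueError."""
        if not self.constraint(value):
            raise ValueError(f"{value!r} is not a valid {self.name}")
        return value

    def define_operator(self, symbol, fn):
        """Register a user-defined operator under the given symbol."""
        self.operators[symbol] = fn

    def apply(self, symbol, *args):
        """Check every argument against the type, then invoke the operator."""
        checked = [self.select(a) for a in args]
        return self.operators[symbol](*checked)


weight = ScalarType("WEIGHT", lambda v: isinstance(v, (int, float)) and v > 0)
weight.define_operator("+", lambda a, b: a + b)
total = weight.apply("+", 1.5, 2.5)
```

The point is only that new types and operators are registered as data, at
run time, rather than by editing multiple engine source files.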



> --As noted above, I am unclear as to the relationship of
> Tutorial D and
> domain members/objects to the outside world including object
> implementation
> language. If there is a connection to an object
> implementation language
> and you are considering java, note that there is a strong
> desire in the
> firebird community to provide java as a user defined function
> language.

I don't know if I got you right, but the creation of domains
themselves is part of Tutorial D's scope, as is the assignment of scalar
values to an attribute. Obviously the creation of such values could be
performed either by Tutorial D itself or by the host language.

As for the use of other languages to define functions, it should be
supported but not required.



> --Modifying the internal engine capabilities may be harder, and more
> controversial. My first thoughts are that the major changes
> needed would
> be to allow joins on blobs. Here's how I think this could be done:
>
> Currently blobs "consist" of a blob id ( number) stored in the normal
> record, referring indirectly to the actual storage location
> of the data,
> which is then of arbitrary length spread out over possibly multiple db
> pages. Equality joins on blobs could presumably be computed
> reasonably
> efficiently by storing in the record not only the blob id
> but also a hash
> code (and possibly a length). Comparison of actual blob
> values would be
> required only if hash codes (and length) match.

As I told you above, BLOBs aren't really what I have in mind. But
for comparison of large values in general, your suggestion is very good, as
one of the requirements of the Third Manifesto is that all types should
always have at least the equality operator.
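
Just to check that I understood the suggestion, here is a minimal Python
sketch of it. The class and function names are hypothetical, SHA-256 is
only my assumption for the hash code, and the bytes held in memory stand in
for data that the engine would really fetch from db pages through the blob
id.

```python
import hashlib


class LargeValue:
    """Hypothetical record field: length and hash stored beside the value."""

    def __init__(self, data: bytes):
        self.data = data  # stands in for pages reached through a blob id
        self.length = len(data)
        self.digest = hashlib.sha256(data).digest()


def values_equal(a, b, counter=None):
    """Equality test that reads the full values only when length and hash match."""
    if a.length != b.length or a.digest != b.digest:
        return False  # decided from the record alone, no page fetch needed
    if counter is not None:
        counter["full_compares"] = counter.get("full_compares", 0) + 1
    return a.data == b.data  # the expensive byte-by-byte comparison


x = LargeValue(b"x" * 100_000)
y = LargeValue(b"x" * 100_000)
z = LargeValue(b"x" * 99_999 + b"y")
counter = {}
same = values_equal(x, y, counter)
different = values_equal(x, z, counter)
```

The final full comparison is kept because two different values can, at
least in principle, share both a length and a hash code.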


> proposal in regards to firebird. I personally like it. I
> think we would
> have to hear more from you about what you have in mind to proceed much
> further. If you have more questions please ask: you may email me

For the moment I think the idea needs more discussion, since I've
never worked on the internals of any DBMS, nor do I grasp all of Tutorial D's
implications. I hope to learn as we proceed.

I had thought about creating a project at http://savannah.gnu.org/
or http://sourceforge.net/, and then further discussions could continue in
some mailing list created there... do you think someone from firebird or ib
would have an interest in participating in such a discussion in such a list?

Thank you very much for your attention!


--
_
/ \ Leandro Guimarães Faria Corsetti Dutra +55 (11) 246 96 07
\ / Amdocs Brasil Ltda, São Paulo +55 (11) 3040 8913
X http://geocities.com/lgdutra/ mailto:leandrod@...
/ \ Campanha fita ASCII, contra correio HTML mailto:lgdutra@...