firebird-architect - Re: [Firebird-Architect] Re: Well, here we go again

Subject	Re: [Firebird-Architect] Re: Well, here we go again
Author	Pavel Cisar
Post date	2008-06-24T11:06:51Z

Jim Starkey wrote:

>>
>> It all boils down to semantics. Relational calculus doesn't care about
>> meaning, period. You can join apples to oranges as long as it happens
>> that the joining attributes are "compatible" (yeah, that ought to be
>> solved to some degree by a more ingenious type system, but not
>> completely), but the main problem is that the result has no semantics
>> attached and could be completely bogus, and there is no way you can
>> tell. Sure, most of it is in fact a built-in constraint of SQL (or any
>> other relational query language), but it's that way because the
>> relational model is based on relational algebra, which is just that -
>> an algebra, a computational tool. It's hard to impossible to carry
>> semantics through relational computation. So all relational query
>> languages basically gave up on this and left all responsibility to
>> developers. And developers, who live in a world of CPU cycles,
>> algorithms, registers, numbers, strings, objects and data structures,
>> generally suck at seeing through information representation (data) to
>> the actual information.
>>
>>
> Pavel, I have read your post over and over and still can't make heads
> nor tails of it.
>
> When I referred to semantics, I was referring to the SQL language
> semantics. SQL syntax has its good flaws and its bad flaws, but it
> generally gets the job done. When you are talking about semantics, you
> clearly have something else in mind.
>
> What are the problems you see with relational semantics (it's been years
> since I've heard someone hold forth on the relative merits of relational
> calculus vs relational algebra, but we'll let that one pass)? What are
> the data semantics (or lack thereof) that concern you? What concrete
> problems do you have?

I meant the semantics of the data, not of the DML language. We express
data semantics through the schema (i.e. tables, views, their
relationships, attributes, domains, etc.). We can even process the
schema automatically, but all that semantic information is poorly used
in queries (most systems don't use it beyond basic physical type
compatibility checks or for optimization) and is completely lost at the
query output, where we get just a homogeneous pile of rows and
attributes. The developer who submitted the query must know what the
result means, and if he wants to pass it to another system for
automatic processing, he has to "add" the semantics again (by
structuring the output as XML, for example, or by sending the DML
sequence that stores the data correctly in the target system's schema).
That is mostly normal, as he has to translate between contexts anyway
(the other system usually uses a different schema), but it depends on
manual labor by developers; it can't be automated or automatically
verified, not to mention the hassle whenever a schema changes.
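
A minimal sketch of the point above, using Python's built-in sqlite3
module only because it runs anywhere. The tables, columns and the
function name are made up for illustration. The join compares a weight
to a price, the engine accepts it because both are integers, and all the
output carries is column names:

```python
import sqlite3


def run_meaningless_join():
    """Join apples to oranges on attributes that are merely type-compatible."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE apple  (id INTEGER PRIMARY KEY, weight_g    INTEGER);
        CREATE TABLE orange (id INTEGER PRIMARY KEY, price_cents INTEGER);
        INSERT INTO apple  VALUES (1, 150), (2, 200);
        INSERT INTO orange VALUES (1, 150), (2, 99);
    """)
    # Grams are compared to cents; nothing in the schema objects to that.
    cur = conn.execute("""
        SELECT a.id AS apple_id, o.id AS orange_id, a.weight_g AS matched_value
        FROM apple a JOIN orange o ON a.weight_g = o.price_cents
    """)
    # All the result set tells the caller: a name per column.
    description = cur.description
    columns = [d[0] for d in description]
    rows = cur.fetchall()
    conn.close()
    return columns, rows, description


if __name__ == "__main__":
    columns, rows, _ = run_meaningless_join()
    print(columns)
    print(rows)
```

Whoever consumes these rows cannot tell from the result alone whether
matched_value is a weight, a price, or nonsense.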

To sum it up, I think that the tools relational systems provide to
define data semantics are insufficient, poorly used by the systems
themselves, and completely lost for any use at the output.

Object databases are much better at this, as data semantics expressed in
objects and their hierarchies and relationships are more expressive and
closer to our real needs, and can be retained in the output. But these
systems are overly complex and have to deal with many implementation
issues that make them less efficient than relational ones.

> Among the problems I have with complex data relationships (persistent
> objects, CODASYL sets, entity/attribute, hierarchical) is that the DML
> is navigational, i.e. visiting a record and deciding where to go next.
> Among the serious problems with navigational DMLs I have seen are:
>
> * A change to the data structure usually invalidates working programs

If you mean a change at the physical level, it depends on the
implementation. It's possible to shield the application from changes in
the underlying physical storage as long as the logical schema structure
is kept, as in the relational model. When the logical schema changes,
you have to adjust the application even if it uses relational DML.

> * Record at a time interactions are catastrophically slow over
> network boundaries

I don't understand this one. Do you mean that records are sent to the
client for evaluation and the decision where to go next? I can't imagine
that's a requirement for a navigational DML.

> * Insufficient regularity for high level tools (dearth of 4GLs, etc.)

It depends. Object-based systems don't have that problem, though.

> Relational languages address these problems nicely, and given a set of
> aggregating interfaces, can move things further along. So I don't
> particularly want the database system to model complex relationships
> other than primary/foreign keys and table inheritance. I'm sure other
> people feel differently, and I look forward to an interesting debate on
> the subject.

A PK is necessary to establish identities, and FKs are good only for
defining relationships between classes / categories of identities. But
you can't express relationships between individual instances (you have
to use a table in between, which is awkward and inefficient), nor can
you express complex and/or conditional relationships between classes.
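
To illustrate the "table in between": a rough sketch, again with
Python's sqlite3 module and a hypothetical schema (person, knows,
friends_of are names invented here). The fact that Alice knows Bob
cannot hang directly on either row; it needs a third table and two joins
to read back:

```python
import sqlite3


def friends_of(name):
    """Return the names of people that `name` knows, via a link table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- The link table exists only to relate individual instances.
        CREATE TABLE knows (
            person_id INTEGER NOT NULL REFERENCES person(id),
            other_id  INTEGER NOT NULL REFERENCES person(id),
            PRIMARY KEY (person_id, other_id)
        );
        INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO knows  VALUES (1, 2), (1, 3);
    """)
    rows = conn.execute("""
        SELECT o.name
        FROM person p
        JOIN knows  k ON k.person_id = p.id
        JOIN person o ON o.id = k.other_id
        WHERE p.name = ?
        ORDER BY o.name
    """, (name,)).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(friends_of("Alice"))
```

The FKs here only say that both columns of knows refer to some person;
any rule about which persons may know which has to live in application
code or triggers.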

> What I do want to do, however, is banish the idea of physical
> datatypes. Tiny int, short int, int, long int, big int, etc. can be
> handled with a single type "number" with a scale factor to handle
> currency (pennies -- with the state of the dollar, who cares?).
> Similarly, I want to fold char, varchar, and clob into single "text" (no
> bound or size). I actually did this inside of Falcon, though there is
> no hope of support from the MySQL server. If anyone wants to hear more,
> let me know -- the basic idea came from this list a fair number of years
> back.
>
> So, Pavel, what is it that you want to do and can't?

If databases were programming languages, flat files would be like
assembler, relational databases like C, and object systems like C++. I
want Python, or at least Java. Unification of data types is a good
start, but it's just a start :-)
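
For readers who want something concrete, here is one way the single
"number" type with a scale factor from the quoted text could look. This
is only a sketch under my own assumptions (the class name, the fields
and the addition rule are invented here); it is not how Falcon or
Firebird implements it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Number:
    """One numeric type: an unbounded integer plus a decimal scale."""
    unscaled: int   # e.g. 1999 with scale 2 means 19.99
    scale: int = 0  # number of digits after the decimal point

    def rescaled(self, scale):
        """Return the same value expressed with a larger (or equal) scale."""
        if scale < self.scale:
            raise ValueError("rescaling down would lose digits")
        return Number(self.unscaled * 10 ** (scale - self.scale), scale)

    def __add__(self, other):
        # Bring both operands to the finer scale, then add the integers.
        scale = max(self.scale, other.scale)
        total = self.rescaled(scale).unscaled + other.rescaled(scale).unscaled
        return Number(total, scale)

    def __str__(self):
        sign = "-" if self.unscaled < 0 else ""
        digits = str(abs(self.unscaled)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return f"{sign}{digits[:-self.scale]}.{digits[-self.scale:]}"


if __name__ == "__main__":
    price = Number(1999, 2)     # 19.99
    shipping = Number(10)       # 10
    print(price + shipping)
```

Python integers are unbounded, so there is no tiny/short/long/big
distinction left to declare; the scale alone covers the currency case.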

best regards
Pavel Cisar
IBPhoenix