firebird-architect - Re: Call for ideas

Subject	Re: Call for ideas - CONNECT BY
Author	paulruizendaal
Post date	2004-04-02T10:33:40Z

> Absolutely. It's a bad idea that comes bad every decade.

Well, I kinda figured that already. Oracle puts all sorts of limits
on CONNECT BY (like "can't do this", "don't combine with that") and
so far that has always signaled that either the concept or the
implementation sucked. Or both. The PostgreSQL implementation lives
as a patch because the core developers thought CONNECT BY was bad,
or so I'm told.

People seem to be much milder about the DB2, Max/SapDB, SQL99 concept
of a recursive view/cursor ("WITH" syntax), so I'd be interested to
hear your opinion about those.

> Unless they're trying to convince you to implement something
> to get the hell away from the thing they currently are using
> and hate. Like, for example, Oracle. They hate Oracle and want
> something just like it so they can get away from Oracle and have
> something new to hate.

Yup, that is the cross I have to bear. I don't care much for (+) join
syntax, but I have to support it. "Hey, here's a way cool database,
rewrite your entire app suite to use it" is not much of a sales pitch.

> I knew it was a bad idea, but customers asked for it. I no
> longer give customers things that I know they won't like.

Ethically and technically you are right. Commercially you are wrong.
I am talking user base here, not money. I prefer to provide what
people want PLUS an escape route to The Right Thing.

> [..historical background..]

- You seem to use 'repeating groups' and 'hierarchical query' as if
those two phrases are the same concept. Correct?
- What is a 'flattening join'? One that returns the entire result
set as a very wide row? :^)

> It's a bad idea. No, it's a terrible idea. It's easy to
> implement -- nested for loops, another RSB type, and some
> hackery in the record format. I've done it many times on the
> same architecture and regretted it each time.

> Hint: you'll need to put in field grouping first. Don't call
> it that. Call it user defined datatypes. User defined datatype
> are way cool and field grouping is a discarded idea from COBOL.
> Then have a magic variable to control the number of repeating
> groups, and Bob's your uncle. Or get fancy and set it up so you
> can have any number of repeating groups per record (Hint: the
> right number is zero).

> Formats is going to bite you. Get leather pants.

Sorry, too terse for my limited current understanding. Please
elaborate.

> Oh, I can explain RSBs, but not tonight.

Well, you already did explain the basic nuts and bolts. Let me try to
repeat so that we can check whether I understood you properly:

The RSE module is a class in disguise:
- Abstract base class, with subclasses for various types, like
. basic table
. indexed
. sorted
. external table
. selectable procedure
- Three methods: open(), next() and close()
- The C equivalent of the object data is a RSB block.
So far, so good. This explanation helped me a lot.

Now we get to the part that is hard to understand: what is the
structure in the subclass types? The full list is:

enum rsb_t
{
rsb_boolean, // predicate (logical condition)
rsb_cross, // inner join as a nested loop
rsb_first, // retrieve first n records
rsb_skip, // skip n records
rsb_indexed, // access via an index
rsb_merge, // join via a sort merge
rsb_sequential, // natural scan access
rsb_sort, // sort
rsb_union, // union
rsb_aggregate, // aggregation
rsb_ext_sequential, // external sequential access
rsb_ext_indexed, // external indexed access
rsb_ext_dbkey, // external DB_KEY access
rsb_navigate, // navigational walk on an index
rsb_left_cross, // left outer join as a nested loop
rsb_procedure // stored procedure
};

So how could this list make sense?
a. The most obvious option would be table types. Here we would
have internal table, external table and procedure (add: cursor)
b. The next option could be access types. Here we would have
sequential, index and bitmap -- perhaps others
c. A third option could be aggregation types, such join, union,
sort, etc.
My feeling is that in the original design subclasses were access
types. Later aggregretion types were added and table types last:
these could only have sequential access, to avoid a cartesian product
of types + other reasons (like: not easy to put an index on a
procedural source).

Then we get to the real confusing bit: skipping records. On the one
hand we have subclasses FIRST and SKIP. Huh? This fits neither a. or
b. or c. On the other hand we have some (but *not* all) subclasses
implementing "next(offset)", instead of "next()". Go from confused to
flabbergasted.

Does this sufficiently elaborate on my question "Can somebody explain
rse_types to me?"

Paul

PS: I do realise that probably most rse types were added by Borland
and that the Firebird community could be as confused as I am on this
topic.