Subject Re: External procedures: call for discussion
Author Roman Rokytskyy
Hi Dmitry,

> Now we have a design specification for another kind of external
> routines - Java ones.

If you mean the paper posted here, that's not 100% correct. We tried
to avoid mentioning Java there as well as define interfaces generic
enough, so that practically any VM can be used (.Net, Python,
JavaScript and VBScript if you want).

This paper is about how engine's internals should look like, so that
others can create "interpreters" for different languages.

> I suggest to follow one rule - distinguish
> between internal and external routines, both semantically and
> syntactically. It means that internal routines are part of a
> database and they are CREATEd (and also
> ALTERed) there, whilst external ones are only DECLAREd. Obviously,
> internal routines are stored inside a database, whilst external ones
> - outside. It allows to (a) preserve the current DDL syntax and (b)
> clearly understand what is what.

Ok, it sounds logical. I will change the paper to reflect this.

> Then it becomes obvious that we now have two types of external
> routines and we need to separate them. More types may be expected to
> appear in the future. Three things required for that are:
>
> 1) some symbolic qualifiers (like the proposed {JAVA | NATIVE} ones)
> in the DDL syntax

symbolic qualifiers should not belong to SQL syntax, otherwise adding
new VM would mean changing engine.

> 2) new field (e.g. RDB$MODULE_TYPE) in RDB$FUNCTIONS
> 3) execution subsytems for different types of external routines


> But I tend to disagree regarding selectable procedures. I see no
> reason to separate selectable and executable ones. In fact, if a
> procedure contains one and only one SUSPEND, it's both selectable
> and executable. This whole issue is perhaps quite tricky and
> sometimes badly understood by newbies, but it gives some power to
> those who know how the things work.

On the other hand it is not clear what to do with the selectable
procedure that is executed with EXECUTE PROCEDURE statement. If
procedure performs some database modifications in the FOR SELECT ;
SUSPEND; loop, fetching only first row leads to incorrect results. I
would rather disable this feature.

> Below is a phrase from your paper:
>
> "Only procedures marked with SELECTABLE keyword can be used only in
> SELECT statement. EXECUTE PROCEDURE statement is used for
> non-selectable procedures."
>
> The first sentense can be made true without breaking anything
> (select from non-selectable procedure always returns a NULL
> resultset, so nobody should use such a code), but the second one is
> a real problem.

Why is it a problem? It might require some work, but in theory it
requires only checking if procedure body contains SUSPEND keyword. We
can even drop need for the keyword, but during BLR compilation simply
set some flag in RDB$PROCEDURES table. And when compiling SELECT or
EXECUTE PROCEDURE statement simply check if value of the flag is
appropriate for the statement.

> And if this issue cannot be solved completely, I
> tend to think that any partial solution will only cause more
> confusion to end-users.

Agree. No partial soultion should be introduced.

> Now regarding the following statement "It should be possible to call
> procedures that return only one output parameter as functions, e.g.
> SELECT myProc(someCol) FROM someTable should be valid SQL
> statement". I'd prefer to declare such functions explicitly, via
> DECLARE EXTERNAL FUNCTION. It's not a matter of the parser only,
> because stored (i.e. internal) functions are required by the SQL
> specification to have slightly different syntax than stored
> procedures, so there won't be any confusion there. But if we clearly
> distinguish between internal procedures and functions, why making
> external ones to behave separately? Our goal is to make the
> difference (from the engine's POV) between internal and external
> routines as much transparent as possible.

My suggestion holds not only for external procedures, but for internal
as well. I know there was a feature request in some list (or maybe
Russian newsgroup) to implement this. Probably Eugeney will have some
comments about your ideas.

> I didn't follow closely the interface details posted in the
> proposal, but generally I agree with all the points. As for the
> selectable procedures implementation, I'd prefer the first option
> proposed.

It seems to be much more closer to engine internals than the second
one. I will drop the second option from the next version of the paper.

> About the internal routines... I see Paul is keen to detach the
> entire PSQL implementation from the engine and make it an
> independent (although still built-in) subsystem under the
> aforementioned interface. I don't have strong pro's or contra's at
> the time being and would like to discuss these ideas later, once the
> external stuff is agreed on.

I think this idea has one advantage. If we agree that PSQL works
through the same interface as other external procedures, we will have
a possibility to check if this interface is implemented correctly by
simply running all existing procedures that we have without need to
implement something in Java/C#/etc. Consider this as an existing test
suite for the interface.

Roman