firebird-architect - Re: External procedures: implementation proposal.

Subject	Re: External procedures: implementation proposal.
Author	Roman Rokytskyy
Post date	2005-07-26T08:47:09Z

Jim,

> Paul, this means the Fyacle and Roman's external procedure mechanism
> will not run with Vulcan. I think it is a serious mistake to
> implement a major feature knowing that it cannot be supported in the
> future. If you need that feature in the immediate feature, I think
> it would be better if you maintain it in your private tree and not
> attempt to merge it into Firebird.

Paul is not trying to merge the feature into Firebird; this proposal is
mainly Eugeney's and my initiative.

Also, please note that we published this document before doing
anything in the Firebird tree, in order to gather input from interested
parties, even though the group of people who wrote it had already more
or less agreed on the overall design.

> I haven't the slightest objection to Roman's external procedure
> mechanism appearing in a private version, but I have a great deal of
> trouble understanding how we can accept code that breaks our
> architecture and can't be integrated with versions now under
> development.

I do not see how this proposal breaks the layering, since it says
nothing about engine callbacks. It is about the API that an external
procedure/module has to implement in order to be plugged into the
engine, how the engine discovers the external procedure, how it passes
the parameters and how it obtains the values. No less and no more.

So, please, let's discuss this first, as the concept of an external
procedure does not in itself imply that there will be a call into the
engine. It can happily load data from some file and present it to the
engine as a result set, or it can talk to Oracle or DB2, where the
IscDsc API does not apply.

So, as to your suggestion to use the IscDsc ResultSet as a procedure
output: I downloaded Vulcan yesterday and went through the sources. I
cannot say that I now understand how Vulcan works (though I will learn
it), but I see the following things that are relevant to our
discussion:

- there is a ResultSet interface defined in Connection.h.

- there are two implementations of the ResultSet interface, one in
IscDsc directory (IscResultSet), one in jrd directory
(InternalResultSet).

- Both define their own Value classes, which basically provide the data
type handling/conversion.

- The IscResultSet works internally with the ISC API, and the
corresponding Value is a wrapper for the XSQLDA/XSQLVAR.

- The InternalResultSet works with jrd8_XXX calls (which are similar
to the ISC API) and the corresponding Value wraps the dsc structure.

> That could be done, but would be horribly inefficient. The internal
> SQL is designed to bypass the SQLDA handling, and consequently is
> significantly faster than DSQL. Taking Java JDBC code, turning it
> into DSQL API (SQLDAs and all) then remapping the DSQL back to JDBC
> strikes me has a miserable exercise in creative squandering of
> processor cycles.

Now, please note that in the proposal we pass the PARAMDSC structure
with the parameter data:

typedef struct paramdsc {
    unsigned char dsc_dtype;
    signed char dsc_scale;
    ISC_USHORT dsc_length;
    short dsc_sub_type;
    ISC_USHORT dsc_flags;
    unsigned char *dsc_address;
} PARAMDSC;

The dsc structure that is wrapped by the Value from the jrd directory
looks like this:

struct dsc
{
    dsc()
    {
        dsc_dtype = 0;
        dsc_scale = 0;
        dsc_length = 0;
        dsc_sub_type = 0;
        dsc_flags = 0;
        dsc_address = 0;
    };

    dsc (UCHAR dtype, USHORT length, void* address, SCHAR scale=0)
    {
        dsc_dtype = dtype;
        dsc_scale = scale;
        dsc_length = length;
        dsc_sub_type = 0;
        dsc_flags = 0;
        dsc_address = (UCHAR*) address;
    };

    UCHAR dsc_dtype;
    SCHAR dsc_scale;
    USHORT dsc_length;
    SSHORT dsc_sub_type;
    USHORT dsc_flags;
    UCHAR* dsc_address;
};

I do not show the XSQLVAR here; it is more or less similar but, from
my point of view, not relevant to this discussion.

So, please explain how the translation from dsc into PARAMDSC can be
less efficient than wrapping dsc with the Value interface. I think it
is at least as efficient as accessing it through the Value interface.
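
To make the point concrete, here is a minimal sketch of such a
translation. The typedefs are my stand-ins for the engine headers, the
dsc constructors are left out for brevity, and toParamDsc is a
hypothetical helper name, not something from the proposal or from
Vulcan:

```cpp
#include <cassert>

// Assumed stand-ins for the engine typedefs; the real definitions
// live in the Firebird/Vulcan headers.
typedef unsigned char  UCHAR;
typedef signed char    SCHAR;
typedef unsigned short USHORT;
typedef short          SSHORT;
typedef unsigned short ISC_USHORT;

// Engine-side descriptor, fields as quoted above (constructors omitted).
struct dsc {
    UCHAR  dsc_dtype;
    SCHAR  dsc_scale;
    USHORT dsc_length;
    SSHORT dsc_sub_type;
    USHORT dsc_flags;
    UCHAR* dsc_address;
};

// Descriptor passed to the external procedure in the proposal.
typedef struct paramdsc {
    unsigned char  dsc_dtype;
    signed char    dsc_scale;
    ISC_USHORT     dsc_length;
    short          dsc_sub_type;
    ISC_USHORT     dsc_flags;
    unsigned char* dsc_address;
} PARAMDSC;

// Hypothetical helper: copies the six fields one by one.
// The data buffer itself is shared through the pointer, not copied.
inline void toParamDsc(const dsc* src, PARAMDSC* dst)
{
    assert(src != 0 && dst != 0);
    dst->dsc_dtype    = src->dsc_dtype;
    dst->dsc_scale    = src->dsc_scale;
    dst->dsc_length   = src->dsc_length;
    dst->dsc_sub_type = src->dsc_sub_type;
    dst->dsc_flags    = src->dsc_flags;
    dst->dsc_address  = src->dsc_address;
}
```

The whole translation is six assignments and no data copy, which is
why I do not expect it to cost more than a call through a wrapper
object.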

Now, to explain why I do not like the idea of returning a full
ResultSet implementation from the procedure: from my point of view it
is just too many methods to implement for a piece of code that is
primarily interested in returning raw data:

- there is no immediate need to support type conversions, since the
engine needs raw data. I see no reason to allow a procedure to be
declared with one set of parameter types while its implementation
works internally with other data types, leaving the engine and the
external procedure to convert between them via the corresponding
ResultSet getters.

- there is no need to support the ResultSetMetaData, for the reason
above: in the first place we need raw data.

- there is no need to support the findColumn call, as there are no
named columns; they are accessed by position.

If we now hide the features described above, we get a new, simplified
ResultSet:

class ResultSet {
public:
    virtual void close() = 0;
    virtual bool next() = 0;
    virtual Value* getValue(int position) = 0;
};

So, how does it differ from the interface defined in the proposal:

class ExternalResource {
public:
    virtual void close(ISC_STATUS* status) = 0;
};

class ResultSet : public ExternalResource {
public:

    virtual bool fetch(ISC_STATUS* status) = 0;

    virtual void getValue(
        ISC_STATUS* status,
        int paramIndex,
        PARAMDSC* returnValue) = 0;
};

The difference is in

a) using PARAMDSC instead of the Value interface;

b) using the ISC_STATUS vector instead of throwing exceptions.

Note also that the actual implementation uses a separate error class
instead of ISC_STATUS, so the proposal has to be updated on that point.

And considering that the Value is just a wrapper for a dsc structure
and that conversion from dsc into PARAMDSC is trivial, we have the same
interface as you suggest, only with fewer methods to implement.
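
As an illustration of how little an external procedure would have to
implement, here is a rough sketch of a procedure that returns three
rows of two integers from memory. ArrayResultSet, the zero-based
paramIndex, the SKETCH_DTYPE_INT32 type code and the typedefs for
ISC_STATUS and ISC_USHORT are all my assumptions for this example, and
error reporting through the status vector is left out:

```cpp
#include <cassert>

typedef long           ISC_STATUS;  // assumption; see the engine headers
typedef unsigned short ISC_USHORT;  // assumption; see the engine headers

typedef struct paramdsc {
    unsigned char  dsc_dtype;
    signed char    dsc_scale;
    ISC_USHORT     dsc_length;
    short          dsc_sub_type;
    ISC_USHORT     dsc_flags;
    unsigned char* dsc_address;
} PARAMDSC;

// Interfaces as in the proposal (virtual destructor added for the sketch).
class ExternalResource {
public:
    virtual ~ExternalResource() {}
    virtual void close(ISC_STATUS* status) = 0;
};

class ResultSet : public ExternalResource {
public:
    virtual bool fetch(ISC_STATUS* status) = 0;
    virtual void getValue(ISC_STATUS* status, int paramIndex,
                          PARAMDSC* returnValue) = 0;
};

// Placeholder type code, NOT the engine's real dtype constant.
enum { SKETCH_DTYPE_INT32 = 1 };

// Hypothetical procedure body: serves rows from an in-memory table.
class ArrayResultSet : public ResultSet {
public:
    ArrayResultSet() : row(-1), closed(false)
    {
        for (int r = 0; r < ROWS; ++r) {
            data[r][0] = r + 1;         // 1, 2, 3
            data[r][1] = (r + 1) * 10;  // 10, 20, 30
        }
    }

    virtual void close(ISC_STATUS*) { closed = true; }

    // Advances to the next row; false when exhausted or closed.
    virtual bool fetch(ISC_STATUS*)
    {
        if (closed || row + 1 >= ROWS)
            return false;
        ++row;
        return true;
    }

    // Describes the raw value of the current row; no type conversion.
    virtual void getValue(ISC_STATUS*, int paramIndex, PARAMDSC* returnValue)
    {
        assert(returnValue != 0);
        assert(row >= 0 && paramIndex >= 0 && paramIndex < COLS);
        returnValue->dsc_dtype    = SKETCH_DTYPE_INT32;
        returnValue->dsc_scale    = 0;
        returnValue->dsc_length   = static_cast<ISC_USHORT>(sizeof(int));
        returnValue->dsc_sub_type = 0;
        returnValue->dsc_flags    = 0;
        returnValue->dsc_address  =
            reinterpret_cast<unsigned char*>(&data[row][paramIndex]);
    }

private:
    enum { ROWS = 3, COLS = 2 };
    int  data[ROWS][COLS];
    int  row;
    bool closed;
};
```

The engine side would then loop on fetch(), call getValue() once per
output column and read the bytes behind dsc_address. There are three
methods in total, and nothing here requires a call back into the
engine.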

So, where is the problem with this approach?

Roman