Subject: Re: [Firebird-Architect] Re: External procedures: implementation proposal.
Author: Dmitry Yemanov
"Roman Rokytskyy" <rrokytskyy@...> wrote:
>
> > First question: should we discuss external functions here as well or
> > is this a subject of a separate talk?
>
> Only after you present a description of how the stored functions will
> work in Firebird.

The SQL spec declares this quite well ;-) Anyway, let's keep them away for a
while.

> > I don't like the idea of self-registering modules. They could be
> > described via the manifest files or they could contain a pre-defined
> > describe() call, but this is the engine that decides to write
> > something into the system tables.
>
> That would limit the possibilities that modules can provide. For
> example, if somebody wants to create a "full-text search module" that
> provides
>
> a) tables for internal index data structures
> b) procedures/functions that allow indexing and searching the repository
>
> he must then provide an SQL script that has to be executed to register
> stuff in the database. But if the module is allowed to perform some
> database tasks automatically during initialization and/or opening the
> database, it can do significantly more.

I still don't like it, sorry. But I don't think it stops our ESP discussion
:-)

> > How is the module name encoded in RDB$EXTERNAL_NAME?
>
> It's not. Language is specified. Database engine reads the
> RDB$LANGUAGE from the RDB$EXTERNAL_PROCEDURES, then reads the
> corresponding module from RDB$EXTERNAL_ENGINE:

This is for Java. For binary compiled modules, RDB$EXTERNAL_NAME should
contain both the library name and the entrypoint. Should we split them in
the system tables a la RDB$FUNCTIONS, or should the external engine module
take care of name decoding and library loading?
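
If the external engine module did the decoding itself, it might look roughly like the sketch below. Note that the '!' separator, the ExternalName struct and the decodeExternalName() helper are all made up for illustration; nothing in this thread fixes an encoding for RDB$EXTERNAL_NAME.

```cpp
#include <optional>
#include <string>

// Hypothetical result of decoding RDB$EXTERNAL_NAME for a binary module.
struct ExternalName {
    std::string library;     // shared library to load
    std::string entryPoint;  // symbol to resolve inside that library
};

// Assumption: the two parts are stored as "library!entrypoint".
// The separator is illustrative only; the thread does not define one.
std::optional<ExternalName> decodeExternalName(const std::string& encoded) {
    const std::string::size_type pos = encoded.find('!');
    // Reject a missing separator and an empty part on either side.
    if (pos == std::string::npos || pos == 0 || pos + 1 == encoded.size())
        return std::nullopt;
    return ExternalName{encoded.substr(0, pos), encoded.substr(pos + 1)};
}
```

Splitting the two parts into separate system table columns would make this helper unnecessary, at the price of a schema that is specific to binary modules.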

> > Shouldn't the external module be loaded on demand?
>
> I would load on server start to avoid any delays on first access and
> to see the configuration error right on the start, but I do not have
> any strong preference here.

Given the questions above, perhaps we should start distinguishing between
external engine modules and ESP modules. If we are talking about the former
here, then I don't have any strong preference either. Still, I don't like to
see something residing in memory when nobody uses it. But the auto-loading
question could also be solved via config options, so I don't think this is
a big issue.
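
To illustrate how one config option could cover both behaviours, here is a minimal sketch. ModuleSlot, Module and the loadOnStart flag are hypothetical names, not part of the proposal, and the loader is assumed to return a non-null module.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Placeholder for a loaded external engine module.
struct Module { std::string name; };

// Holds one module and loads it either at construction (server start)
// or lazily on first use, depending on a hypothetical config flag.
class ModuleSlot {
public:
    using Loader = std::function<std::unique_ptr<Module>()>;

    ModuleSlot(Loader loader, bool loadOnStart) : loader_(std::move(loader)) {
        if (loadOnStart) get();  // configuration errors surface right away
    }

    // Returns the module, loading it on first access if needed.
    Module& get() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (!module_) module_ = loader_();
        return *module_;
    }

    bool loaded() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return module_ != nullptr;
    }

private:
    Loader loader_;
    mutable std::mutex mutex_;
    std::unique_ptr<Module> module_;
};
```

With loadOnStart set to false nothing stays in memory until the first call needs it; with true the first-access delay goes away.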

> > I think that the ExternalResource::close() implementation must be
> > required to be tolerant to being called multiple times.
>
> Why? Usually this requirement comes together with an unclear
> specification of when to close what.

Perhaps you're right. I just don't want to see server crashes when they
could be avoided.
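
For reference, tolerating repeated calls costs very little on the implementation side. The class below is only a stand-in that borrows the ExternalResource name; the real interface from the proposal is not reproduced here, and the release callback is assumed not to throw.

```cpp
#include <functional>
#include <utility>

// Stand-in for an external resource whose close() may be called repeatedly.
class ExternalResource {
public:
    explicit ExternalResource(std::function<void()> release)
        : release_(std::move(release)) {}

    // Closing in the destructor is safe because close() is idempotent.
    ~ExternalResource() { close(); }

    ExternalResource(const ExternalResource&) = delete;
    ExternalResource& operator=(const ExternalResource&) = delete;

    // Releases the underlying resource once; later calls are no-ops.
    void close() {
        if (closed_) return;
        closed_ = true;
        if (release_) release_();
    }

    bool isClosed() const { return closed_; }

private:
    std::function<void()> release_;
    bool closed_ = false;
};
```

This does not answer the question of who is responsible for closing what; it only makes a double close harmless.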

> > And I'd expect the implementation (the engine?) to return an error
> > otherwise.
>
> How are you going to detect this?

The engine calls attachThread() before calling an ESP, so we have a list of
threads allowed to do this. The get_current_attachment_and_transaction()
code could return NULL handles if it's called from a thread outside this
list.
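
A minimal sketch of that check, assuming the engine keeps a per-thread map filled by attachThread(). The ThreadRegistry class, the Context struct and the empty handle types are hypothetical; only the attachThread() name and the NULL-handles behaviour come from the discussion above.

```cpp
#include <map>
#include <mutex>
#include <thread>

// Placeholder handle types; the real engine types are not shown here.
struct Attachment {};
struct Transaction {};

// What get_current_attachment_and_transaction() would hand back.
struct Context {
    Attachment* attachment = nullptr;
    Transaction* transaction = nullptr;
};

class ThreadRegistry {
public:
    // Called by the engine on the calling thread before it invokes an ESP.
    void attachThread(Context ctx) {
        std::lock_guard<std::mutex> guard(mutex_);
        contexts_[std::this_thread::get_id()] = ctx;
    }

    // Called by the engine on the same thread after the ESP returns.
    void detachThread() {
        std::lock_guard<std::mutex> guard(mutex_);
        contexts_.erase(std::this_thread::get_id());
    }

    // Returns NULL handles when the calling thread is not in the list.
    Context currentContext() const {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = contexts_.find(std::this_thread::get_id());
        return it == contexts_.end() ? Context{} : it->second;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::thread::id, Context> contexts_;
};
```

A thread spawned by the ESP itself never passes through attachThread(), so it gets NULL handles and can turn that into an error.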

> > I see no problems with spawning multiple threads for the
> > computation/communication purposes, neither with accessing
> > databases via new explicitly made attachments.
>
> I already see a problem here - you lose control over your child
> thread. Then you The computation model should be very similar to the
> one in EJB - no threads, no static variables, etc. The execution model
> has always explicit start and explicit end. When execution has ended,
> no part of the code that was executed is alive.

If the child thread is spawned at the beginning of the ESP and terminated at
the end, what harm is done? And please don't forget about other languages.
People have created a lot of multi-threaded UDFs for their needs, so they
will expect to do the same in ESPs. With C++ or Delphi, you are able to
keep control over all the resources you have.

> > Is the engine expected to return an error if the external procedure
> > called via EXEC PROC returned a result set object or should it
> > silently ignore the result set?
>
> Depends on the procedure declaration, I'd say.
>
> - If no OUT parameters are defined and result set is returned - an error.
>
> - If OUT parameters are defined and no result set is present - error.
>
> - If OUT parameters are defined and RS does not contain rows - all
> NULLs are returned.

Agreed.

> - If OUT parameters are defined and RS contains more than one row - error.

This one behaves differently from the current engine, but I agree here too.
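
To have the agreed rules in one place, here they are as a small decision function. The names are hypothetical, and two cases are my reading rather than something stated above: no OUT parameters with no result set is treated as a plain success, and exactly one row is returned as is.

```cpp
#include <cstddef>
#include <optional>

// Possible outcomes of EXEC PROC on an external procedure.
enum class ExecProcResult { NoOutput, SingleRow, AllNulls, Error };

// rowCount is empty when the procedure returned no result set at all,
// otherwise it holds the number of rows in the returned result set.
ExecProcResult classifyExecProc(bool hasOutParams,
                                std::optional<std::size_t> rowCount) {
    if (!hasOutParams) {
        // A result set without declared OUT parameters is an error.
        // Assumption: no OUT parameters and no result set is fine.
        return rowCount ? ExecProcResult::Error : ExecProcResult::NoOutput;
    }
    if (!rowCount) return ExecProcResult::Error;            // OUT params, no RS
    if (*rowCount == 0) return ExecProcResult::AllNulls;    // empty RS
    if (*rowCount == 1) return ExecProcResult::SingleRow;   // assumed normal case
    return ExecProcResult::Error;                           // more than one row
}
```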


Dmitry