Subject Re: [Firebird-Architect] External Engines (and Plugins)
Author Adriano dos Santos Fernandes
Vlad wrote:
>> Syntax:
>>
>> { CREATE [ OR ALTER ] | RECREATE | ALTER } PROCEDURE <name>
>> [ ( <parameter list> ) ]
>> [ RETURNS ( <parameter list> ) ]
>> ENTRY_POINT '<entry point>'
>> LANGUAGE <language>
>>
>
> What is the exact meaning of the LANGUAGE clause ? From the explanations below I see
> it as a plugin identifier, not related directly to any language.
>
External plugins implement languages. Languages are made available to
databases through the configuration file. This is almost how INTL works,
and it allows simultaneous use of many plugins implementing many
languages. The admin chooses in the config file which languages are
taken from which plugin. This also makes clear which features are
available to databases.

AFAIK, this is also how the original implementation works, except that
the languages/libraries are chosen from a system table instead of a
config file.

Writing external routines directly against the C++ interface is not as
easy as writing a UDF, and it is even more difficult for other (non-C++)
languages. So things are broken into two layers: plugins and user
libraries.

For example, there is the C++ (engine) plugin. It implements the CPP
language. Users will not reimplement it, but write libraries that
register routines in the C++ plugin.

Firebird talks with the plugin, and the plugin talks with user
libraries. This is also how Java routines will work: the plugin will
load user classes.
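
A minimal sketch of the two layers, only to illustrate the idea. All
names here (CppEnginePlugin, registerRoutine, loadUserLibrary) are
hypothetical and are not taken from FirebirdApi.h; routines are reduced
to int(int) to keep it short.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the C++ (CPP) engine plugin. It owns a registry
// that user libraries fill in; Firebird only ever talks to this class.
class CppEnginePlugin {
public:
    using Routine = std::function<int(int)>;

    // Called by a user library while the plugin is loading it.
    void registerRoutine(const std::string& name, Routine routine) {
        assert(routine && "a routine must be callable");
        routines_[name] = std::move(routine);
    }

    // Called on behalf of Firebird when an external routine is executed.
    int call(const std::string& name, int argument) const {
        auto it = routines_.find(name);
        if (it == routines_.end())
            throw std::runtime_error("routine not registered: " + name);
        return it->second(argument);
    }

private:
    std::map<std::string, Routine> routines_;
};

// Hypothetical user library: it knows the plugin, never Firebird itself.
inline void loadUserLibrary(CppEnginePlugin& plugin) {
    plugin.registerRoutine("twice", [](int n) { return 2 * n; });
}
```

After loadUserLibrary(plugin), a call such as plugin.call("twice", 21)
is routed to the user routine without Firebird knowing the library.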


> Also, you removed EXTERNAL keyword from original implementation. What is the
> reasons for it ?
>
I changed EXTERNAL NAME to ENTRY_POINT. It makes the declaration look
more like a UDF declaration. Also, EXTERNAL NAME does not read well
inside an EXTERNAL FUNCTION declaration.

> ...
>
>
>> Here are links to the latest versions of the files that are worth
>> discussing. Please verify them:
>>
>> FirebirdApi.h is the generic new C++ API. These classes were designed
>> with the future replacement of the ISC API in mind. Only the classes and
>> methods necessary for external engines were created -
>> http://firebird.cvs.sourceforge.net/firebird/firebird2/src/include/FirebirdApi.h?view=markup&pathrev=B2_5_ExtEngines
>>
>
> Why FB_CALL is defined as __stdcall for WIN32 only ?
The native calling convention on Windows is STDCALL. Also, the default
calling convention for C++ member functions in MSVC is THISCALL, which
is close to STDCALL (the callee cleans the stack in both). Certainly, we
may define FB_CALL as CDECL, but since COM uses STDCALL, I think we
should use it too.

And finally, the ISC API uses it too.
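
To make the discussion concrete, here is a sketch of such a macro and of
an interface method that uses it. The macro body follows what is
described above (__stdcall on WIN32 only); the Routine and Doubler types
are hypothetical and exist only for this example. On 64-bit Windows the
keyword is accepted and ignored by the compiler.

```cpp
#include <cassert>

// __stdcall on Windows, nothing elsewhere (other platforms have a single
// native convention for this purpose).
#if defined(_WIN32)
#define FB_CALL __stdcall
#else
#define FB_CALL
#endif

// Hypothetical interface: every virtual method crossing the library
// boundary carries the same explicit calling convention.
struct Routine {
    virtual ~Routine() = default;
    virtual int FB_CALL execute(int input) = 0;
};

// Hypothetical implementation on the other side of the boundary.
struct Doubler : Routine {
    int FB_CALL execute(int input) override { return input * 2; }
};

// Usage: the caller only sees the interface.
inline void usageExample() {
    Doubler doubler;
    Routine& routine = doubler;
    assert(routine.execute(21) == 42);
}
```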

> Why do you use the MSVC
> extended syntax ? IIRC plain "stdcall" is supported by every compiler while
> "__stdcall" is an MS extension.
>
This is how we define it in the public ibase.h.

> How memory for Values is allocated and freed ? Who is responsible for memory
> management ?
>
Values is allocated and freed by Firebird. It is just passed to plugins
and is valid only during the call. This is how it is going to work
inside a ResultSet too. It is always freed by whoever allocated it, so
there is no need to extend Disposable.

> Why values within Values class enumerated starting from 1 but not zero ?
>
Are you talking about the "index" parameter? This is how almost all SQL
libraries work. I prefer indexes starting from 0, but I gave up on this.
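
A simplified sketch of both points (ownership and 1-based indexes). This
Values class is a hypothetical stand-in holding only integers; the real
class in FirebirdApi.h has a different shape, and sumAll is an invented
plugin-side function.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical, integer-only stand-in for Values. The host (Firebird)
// creates and destroys it; plugins only borrow it.
class Values {
public:
    explicit Values(std::vector<int> data) : data_(std::move(data)) {}

    std::size_t getCount() const { return data_.size(); }

    // index is 1-based, as in most SQL call-level interfaces.
    int getInt(std::size_t index) const {
        if (index < 1 || index > data_.size())
            throw std::out_of_range("Values index is 1-based");
        return data_[index - 1];
    }

private:
    std::vector<int> data_;
};

// Plugin-side code: uses the object during the call and must neither
// free it nor keep the pointer after returning.
inline int sumAll(const Values* values) {
    assert(values != nullptr);
    int total = 0;
    for (std::size_t i = 1; i <= values->getCount(); ++i)
        total += values->getInt(i);
    return total;
}
```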

> ...
>
>
>> Firebird Plugin API:
>>
>> When Firebird is initializing, it opens all *.conf files from
>> <fbroot>/plugins. For each plugin_module tag found, it constructs a
>> Plugin object, reads the corresponding plugin_config tag and inserts all
>> config information in the object.
>>
>> It then gets the attribute value of plugin_module/filename, loads it as a
>> dynamic (shared) library and calls the exported function firebirdPlugin
>> (PluginEntryPoint prototype) passing the Plugin object as parameter.
>>
>
> So, Firebird loads *all* plugins at initialization time ? Why ?
> If there is needs to load some plugin (or some plugin's kind) within the engine
> itself lets add corresponding attribute in config file, but please, don't load
> all not needed libraries at startup.
>
Seems ok.

>
>> The plugin library may save the plugin object and call its methods
>> later. The object and all pointers returned by it are valid until the
>> plugin is unloaded (done through OS unload of the dynamic library) when
>> Firebird is shutting down.
>>
>
> I think reference counting is much better for such usage. Also it allows to
> reload plugin library without stop of Firebird.
>
If we implement plugin reload, the plugin will need to be notified in
some way so that it can reload its state. I don't see any need for
reference counting here.

>
>> Inside the plugin entry point (firebirdPlugin), the plugin may register
>> extra functionality that may be obtained by Firebird when required.
>> Currently only External Engines may be registered through
>> Plugin::setExternalEngineFactory.
>>
>
> How it can be extended in future for new kinds of plugins ? Adding new
> Plugin::setXXXFactory ?
Yes. All API sets are versioned, so they can be extended and used
appropriately. But not all plugins will register through factories. Some
may need to call simple Plugin::setXXX methods.
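
One possible shape for this, as a sketch only. The versioning scheme
shown (a getVersion method plus a derived PluginV2 interface) is my
assumption for the example, not the actual design; TraceFactory,
setTraceFactory, HostV1, HostV2 and registerAll are hypothetical names.

```cpp
#include <cassert>

struct ExternalEngineFactory { virtual ~ExternalEngineFactory() = default; };
struct TraceFactory { virtual ~TraceFactory() = default; };  // hypothetical future kind

// Version 1 of the interface: only external engines can be registered.
class Plugin {
public:
    virtual ~Plugin() = default;
    virtual int getVersion() const = 0;
    virtual void setExternalEngineFactory(ExternalEngineFactory* factory) = 0;
};

// Hypothetical version 2: adds one setter and leaves version 1 untouched.
class PluginV2 : public Plugin {
public:
    virtual void setTraceFactory(TraceFactory* factory) = 0;
};

// Plugin side: register what the host version supports, skip the rest.
inline void registerAll(Plugin* plugin, ExternalEngineFactory* engines,
                        TraceFactory* trace) {
    assert(plugin != nullptr);
    plugin->setExternalEngineFactory(engines);
    if (plugin->getVersion() >= 2)
        static_cast<PluginV2*>(plugin)->setTraceFactory(trace);
}

// Host side, as an older server would implement it.
class HostV1 : public Plugin {
public:
    ExternalEngineFactory* engines = nullptr;
    int getVersion() const override { return 1; }
    void setExternalEngineFactory(ExternalEngineFactory* f) override { engines = f; }
};

// Host side, as a newer server would implement it.
class HostV2 : public PluginV2 {
public:
    ExternalEngineFactory* engines = nullptr;
    TraceFactory* trace = nullptr;
    int getVersion() const override { return 2; }
    void setExternalEngineFactory(ExternalEngineFactory* f) override { engines = f; }
    void setTraceFactory(TraceFactory* f) override { trace = f; }
};
```

The same plugin binary then works against both hosts: with HostV1 only
the engine factory is registered, with HostV2 both are.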

> It seems for me it is better to introduce enumeration of
> known plugin types and create only universal method to register all kinds of
> factories, such as
>
> Plugin::registerFactory(PluginKind, PluginFactory).
>
This would force a plugin to extend a whole set of methods just to do
one thing.

> Also it seems better to use word "Plugin" instead of "ExternalEngine" for plugin
> API :)
>
I do not like the words "external engine" or ESP from the original
implementations. :-) But Plugin is too generic. We may have a
TracePlugin, for example. I think "Language Plugin" is a more
appropriate term.

>
>
>> Example plugin configuration file:
>>
>> <language CPP>
>> plugin_module CPP_engine
>> </language>
>>
>> <plugin_module CPP_engine>
>> filename $(this)/cpp_engine
>> plugin_config CPP_config
>> </plugin_module>
>>
>> <plugin_config CPP_config>
>> path $(this)/cpp
>> </plugin_config>
>>
>> Note that the language tag is ignored at this stage. Only plugin_module
>> and plugin_config are read. The dynamic library extension may be
>> omitted, and $(this) expands to the directory of the .conf file.
>>
>
> What if more than one <language CPP> (or <plugin_module CPP_engine>, or any
> other section) section is present in config file ?
>
Errors are written to the log and none of them will load.
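
A rough sketch of that rule, assuming the section syntax of the example
file above. parseConfig and its types are hypothetical names, not the
real parser; $(this) expansion, comments and trailing whitespace are
deliberately not handled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Key is (tag, name), e.g. ("plugin_module", "CPP_engine").
using SectionKey = std::pair<std::string, std::string>;
using Section = std::map<std::string, std::string>;
using Config = std::map<SectionKey, Section>;

// Parses the text of one .conf file. If any section is declared twice,
// fills `error` (to be logged), leaves `out` empty and returns false.
inline bool parseConfig(const std::string& text, Config& out, std::string& error) {
    out.clear();
    error.clear();
    std::istringstream input(text);
    std::string line;
    Section* current = nullptr;
    while (std::getline(input, line)) {
        std::istringstream words(line);
        std::string first, rest;
        if (!(words >> first))
            continue;                          // blank line
        std::getline(words >> std::ws, rest);  // remainder of the line, if any
        if (first.compare(0, 2, "</") == 0) {
            current = nullptr;                 // closing tag
        } else if (first[0] == '<') {
            std::string tag = first.substr(1); // "<tag name>"
            std::string name = rest;
            if (!name.empty() && name.back() == '>')
                name.pop_back();
            SectionKey key(tag, name);
            if (out.count(key) != 0) {
                error = "duplicate section <" + tag + " " + name + ">";
                out.clear();                   // none will load
                return false;
            }
            current = &out[key];
        } else if (current != nullptr) {
            (*current)[first] = rest;          // "key value" inside a section
        }
    }
    return true;
}

// Usage with a fragment of the example configuration.
inline void usageExample() {
    Config config;
    std::string error;
    bool ok = parseConfig("<language CPP>\nplugin_module CPP_engine\n</language>\n",
                          config, error);
    assert(ok);
    assert(config[SectionKey("language", "CPP")]["plugin_module"] == "CPP_engine");
}
```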

>
>> Plugins access Firebird databases through the client library read from
>> Plugin::getLibraryName method. This method may return different
>> filenames depending on the server architecture, and may even return
>> NULL. Currently it returns:
>> Architecture -> File
>> Embedded -> The embedded library
>> Windows SS -> fbserver executable [3]
>> Windows CS/SC -> fb_inet_server executable [3]
>> POSIX CS/SC -> The embedded library
>> POSIX SS -> NULL [application should open it through dlopen(NULL)] [3]
>> [3] The functions are exported directly from the executable. This is not
>> a well known/used technique, but it works in Windows and POSIX.
>>
>
> I.e. plugin must load ISC API functions by itself and never used fbclient for
> this, correct ?
>
Yes. But in FB3 we may have a single engine library (firebird.dll/so)
used by all kinds of servers, and using it directly will work.

>
>> External Engines API:
>>
>> Entry points are opaque strings to Firebird. They are recognized by
>> specific external engines. An external engine is the implementation of a
>> language. Languages are declared in config files (possibly in the same
>> file as a plugin, like in the config example present here).
>>
>
> I still see no correspondence between natural meaning of word 'language' and
> how it is used. I can write plugin which will work with any dll written on any
> language and how i must register 'language' of my plugin ?
>
The language is registered in the config file so Firebird knows which
plugin it should ask for it. Plugins return ExternalEngines based on the
language parameter of ExternalEngineFactory::createEngine.
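
The lookup can be sketched like this. ExternalEngineFactory::createEngine
is shown with a simplified signature (the real one takes more than the
language), and EngineManager, CppEngine and CppEngineFactory are
hypothetical names. The cache per (database, language) pair models the
Super-Server case, where the engine is obtained once per database.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

class ExternalEngine {
public:
    virtual ~ExternalEngine() = default;
    virtual std::string language() const = 0;
};

class ExternalEngineFactory {
public:
    virtual ~ExternalEngineFactory() = default;
    // Simplified signature, for illustration only.
    virtual std::unique_ptr<ExternalEngine> createEngine(const std::string& language) = 0;
};

// Hypothetical host-side bookkeeping.
class EngineManager {
public:
    // Mirrors "<language X> plugin_module Y": language name -> plugin factory.
    void declareLanguage(const std::string& language, ExternalEngineFactory* factory) {
        assert(factory != nullptr);
        factories_[language] = factory;
    }

    // Returns the engine for a database, creating it on first use.
    ExternalEngine& getEngine(const std::string& database, const std::string& language) {
        auto key = std::make_pair(database, language);
        auto cached = engines_.find(key);
        if (cached != engines_.end())
            return *cached->second;
        auto factory = factories_.find(language);
        if (factory == factories_.end())
            throw std::runtime_error("language not declared: " + language);
        std::unique_ptr<ExternalEngine> engine = factory->second->createEngine(language);
        ExternalEngine& result = *engine;
        engines_[key] = std::move(engine);
        return result;
    }

private:
    std::map<std::string, ExternalEngineFactory*> factories_;
    std::map<std::pair<std::string, std::string>, std::unique_ptr<ExternalEngine>> engines_;
};

// Hypothetical plugin implementing the CPP language.
class CppEngine : public ExternalEngine {
public:
    std::string language() const override { return "CPP"; }
};

class CppEngineFactory : public ExternalEngineFactory {
public:
    int created = 0;  // counts createEngine calls, for the example only
    std::unique_ptr<ExternalEngine> createEngine(const std::string&) override {
        ++created;
        return std::unique_ptr<ExternalEngine>(new CppEngine());
    }
};
```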

>
>> When Firebird wants to load an external routine into its metadata cache,
>> it gets (if not already done for the database [4]) the external engine
>> through the plugin external engine factory and asks it for the routine.
>> The plugin used is the one referenced by the attribute plugin_module of
>> the routine's language.
>> [4] This is in Super-Server. In [Super-]Classic, different attachments
>> to one database create multiple metadata caches and hence multiple
>> external engine instances.
>>
>>
>> The C++ (CPP) engine:
>>
>> Entry points of the C++ engine are defined as following:
>> '<module name>!<routine name>!<misc info>'
>>
>> The <module name> is used to locate the library, <routine name> is used
>> to locate the routine registered by the given module, and <misc info> is
>> a user-defined string passed to the routine, which can read it there.
>> "!<misc info>" may be omitted.
>>
>
> Why it is better than passing this <misc info> as parameter ?
>
Because some things are better encoded in the metadata, and for triggers
there are no parameters. For example, see my REPLICATE trigger example.
The database designer chooses the datasource, and the trigger will read
the properties of that datasource from a table.
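
For reference, splitting such an entry point could look like the sketch
below. EntryPoint and parseEntryPoint are hypothetical names, and the
module/routine strings in the usage are invented. I assume here that
everything after the second "!" belongs to <misc info>, even if it
contains further "!" characters.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct EntryPoint {
    std::string moduleName;   // used to locate the user library
    std::string routineName;  // used to locate the routine registered by it
    std::string miscInfo;     // optional, passed to the routine as-is
};

// Splits '<module name>!<routine name>!<misc info>'; "!<misc info>" may be
// omitted. Throws if the module or routine part is missing.
inline EntryPoint parseEntryPoint(const std::string& text) {
    const std::string::size_type first = text.find('!');
    if (first == std::string::npos)
        throw std::invalid_argument("expected '<module name>!<routine name>'");
    const std::string::size_type second = text.find('!', first + 1);

    EntryPoint result;
    result.moduleName = text.substr(0, first);
    if (second == std::string::npos) {
        result.routineName = text.substr(first + 1);
    } else {
        result.routineName = text.substr(first + 1, second - first - 1);
        result.miscInfo = text.substr(second + 1);
    }
    if (result.moduleName.empty() || result.routineName.empty())
        throw std::invalid_argument("module and routine names must not be empty");
    return result;
}

// Usage: a trigger entry point carrying a datasource name as misc info.
inline void usageExample() {
    EntryPoint ep = parseEntryPoint("mylib!replicate!ds1");
    assert(ep.moduleName == "mylib");
    assert(ep.routineName == "replicate");
    assert(ep.miscInfo == "ds1");
}
```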

>
> Regards,
> Vlad
>
> PS It seems as a good idea to mention original creator of code in headers. I
> mean at least Eugeney Poutilin.
>
Sorry. Will do it, certainly.

> PPS Do you plan to introduce interface for user defined aggregates ?
IMO we first need a generic way to write user defined aggregates, i.e.,
it should be possible to write aggregate functions in PSQL.

One way to do it may be through normal functions (but marked as
aggregate) in which SUSPEND asks for new rows instead of producing rows.
Something like:

create aggregate function mult (
    n integer
) returns integer
as
    declare ret integer = 1;
begin
    while (n is not null) do
    begin
        ret = ret * n;
        suspend;
    end

    return ret;
end
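
To show the intended control flow outside PSQL, here is the same idea
emulated in C++ (C++17 for std::optional). This is only a model of the
proposal: the rows are passed as a vector, fetching the next element
plays the role of SUSPEND, and std::nullopt plays the role of the NULL
that ends the loop (so, as in the PSQL above, a NULL row and the end of
input look the same).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Emulation of the proposed aggregate: multiplies rows until a NULL
// (std::nullopt) or the end of the input is reached.
inline int mult(const std::vector<std::optional<int>>& rows) {
    int ret = 1;
    std::size_t next = 0;

    // Models SUSPEND: hand control back and receive the next row in "n".
    auto fetch = [&]() -> std::optional<int> {
        if (next < rows.size())
            return rows[next++];
        return std::nullopt;
    };

    std::optional<int> n = fetch();  // first row, available on entry
    while (n.has_value()) {
        ret = ret * *n;
        n = fetch();                 // suspend;
    }
    return ret;
}

// Usage: three rows, then the end of input.
inline void usageExample() {
    std::vector<std::optional<int>> rows = {2, 3, 7};
    assert(mult(rows) == 42);
}
```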


Adriano