Subject Re: "Procedural engine" modules
Author paulruizendaal
--- In Firebird-Architect@yahoogroups.com, Jim Starkey <jas@n...> wrote:
>
> Roman Rokytskyy wrote:
>
> >>I suppose a low level ad hoc interface could be cobbled together,
> >>but JDBC is almost certainly the right answer. A JDBC
> >>implementation requires a layer to map JDBC semantics into engine
> >>semantics (one could be recycled from my ODBC driver).
> >>
> >>
> >
> >I planned to reuse JayBird for this. The bottom-most layer of
> >JayBird is a Java implementation of the GDS API; the layers above
> >map engine semantics into JDBC semantics. So, if we get a GDS-like
> >API we will be able to create a JNI proxy between the engine and
> >JayBird. The rest is almost trivial then. Or am I wearing pink
> >glasses?
> >
> If I were king and dictator, I'd declare the existing API to be a
> legacy interface, and find something a little more modern. My first
> choice would be JDBC semantics as the most principled generally
> accepted database API available. While first crack might layer a
> JDBC translation layer on the existing DSQL/BLR engine, I'd evolve
> it into a SQL engine coexisting with the BLR engine, sharing OPT,
> EVL, VIO, IDX, etc. If I started with the existing C++ JDBC layer
> to support embedded Java JDBC access, I'd not only have a higher
> performance subsystem, but one that could be the core of a more
> modern engine.
>
> If my only goal were embedded Java JDBC access, I would consider
> your approach.

I've been thinking about this for a bit. For my purposes (stored
procedures) I need to be able to serialize the compiled request. BLR
isn't all bad for that, although it could be better now that DSQL is
server side.

For truly dynamic SQL, Jim is right: it would be better to have a SQL
compiler that translated directly to a JRD_NODE tree, with the
optimiser called as a last pass. I don't agree with Jim that EVL
is 'good' and looper is 'bad'; in my opinion it is a single VM, split
over two files. Arguably, the VM design can be streamlined.

I suppose it would be quite feasible to do such a SQL parser with a
new API, which would be JDBC(-like). In due course I would need to
implement this too, to allow my BTC-core to execute dynamic SQL
efficiently (i.e. SQL that is dynamically generated in compiled,
stored PL/SQL code).

However, for my feeble abilities this is a bridge too far (hey, that
bridge is less than 50 miles away :-> ). As a first step I'm taking
the same route as Roman -- if it works in ExecuteStatement.cpp, why
wouldn't it work for us?

> >>It also requires a mechanism to synchronize JDBC object
> >>lifetimes with corresponding engine objects/blocks. The work is
> >>well worth doing, but non-trivial, particularly if you're going
> >>to "do it right."
> >>
> >>
> >
> >Will this not be handled automatically by the JVM? I have little
> >experience in JNI, but if there is a call from the engine to the
> >JVM to execute some Java code, there must be some call to stop
> >execution, right? Also, when the code is executed, the call will
> >be returned back to the engine anyway. Correct?
> >
> >
> The problem is lingering semi-orphaned JDBC objects referencing
> potentially deleted native objects. The engine really has to track
> Java object lifetimes to break the association of Java object to
> its "native" counterpart or face an ugly situation when a lingering
> object makes a native reference to a C++ object. If Java
> programmers could be taught to close objects when they're done
> playing with them and not to start threads when the whim strikes,
> it wouldn't be a problem. But now that CPU and memory are almost
> free and almost infinite, kids don't learn these things until their
> regularly scheduled mid-life crisis.
>
> Believe me, the most difficult part of Netfrastructure is keeping
> everyone happy when a connect exits with active statements and
> resultSets that need to be closed before the Java garbage collector
> gets around (if ever) to finalizing the objects. A really nasty
> multi-thread race condition.

Hmmm... I've got the nasty feeling I'm missing the full meaning of
Jim's point, but for what it is worth:

Probably, at least on the C++ side, result set objects are accessible
from statements and statement objects from connection objects, and
vice versa. If a result set gets finalised, the C++ result set object
should close the result set and notify the statement object of its
destruction. If a connection gets finalised, the C++ connection
object should notify statements to clean up, which propagates to the
result sets. If these are subsequently accessed, an error should be
thrown. If a cleaned-up result set gets finalised, the close step is
skipped. A single, dedicated mutex can avoid races between clean-up
and finalise code. Sure, a lot of boilerplate code but not of
mindboggling complexity.
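
To make the scheme above concrete, here is a minimal sketch under stated
assumptions. NativePeer, checkOpen, finalise, cleanUpLocked and
lifetimeMutex are all hypothetical names invented for illustration; none
of them come from Firebird, JayBird or Netfrastructure. It assumes that
finalise() is what a JNI glue layer would call when the Java object is
finalised, and it leaves out the actual engine calls.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <vector>

// The single, dedicated mutex shared by clean-up and finalise code.
std::mutex lifetimeMutex;

// Hypothetical native counterpart of a Java connection, statement or
// result set. Parents know their children and vice versa.
class NativePeer {
public:
    explicit NativePeer(NativePeer* owner = nullptr) : parent(owner) {
        std::lock_guard<std::mutex> guard(lifetimeMutex);
        if (parent) {
            if (!parent->open) throw std::runtime_error("parent is closed");
            parent->children.push_back(this);
        }
    }
    virtual ~NativePeer() = default;

    // Every ordinary API call would start with this check.
    void checkOpen() {
        std::lock_guard<std::mutex> guard(lifetimeMutex);
        if (!open) throw std::runtime_error("object is closed");
    }

    // Assumed to be called once, when the Java object is finalised.
    void finalise() {
        {
            std::lock_guard<std::mutex> guard(lifetimeMutex);
            if (parent) {  // still attached: notify the parent
                auto& sibs = parent->children;
                sibs.erase(std::remove(sibs.begin(), sibs.end(), this),
                           sibs.end());
            }
            cleanUpLocked();
            assert(!open && children.empty());
        }
        delete this;
    }

private:
    // Caller must hold lifetimeMutex.
    void cleanUpLocked() {
        if (!open) return;  // already cleaned up: skip the close step
        open = false;       // a real engine would release resources here
        for (NativePeer* child : children) child->cleanUpLocked();
        children.clear();
        parent = nullptr;   // break the link to a possibly dying parent
    }

    NativePeer* parent;
    std::vector<NativePeer*> children;
    bool open = true;
};

struct Connection : NativePeer {};
struct Statement : NativePeer {
    explicit Statement(Connection* c) : NativePeer(c) {}
};
struct ResultSet : NativePeer {
    explicit ResultSet(Statement* s) : NativePeer(s) {}
    bool next() { checkOpen(); return false; }  // no rows in this sketch
};

// The connection is finalised first; the orphaned result set must throw
// on access and must still finalise without touching the dead parent.
bool orphanedResultSetThrows() {
    auto* conn = new Connection;
    auto* stmt = new Statement(conn);
    auto* rs = new ResultSet(stmt);
    conn->finalise();
    bool threw = false;
    try { rs->next(); } catch (const std::runtime_error&) { threw = true; }
    rs->finalise();
    stmt->finalise();
    return threw;
}
```

The sketch takes the mutex at every entry point and never holds it
across the delete, so clean-up triggered by a parent and finalisation of
a child cannot interleave. It does not address a thread that is still
inside an engine call when its connection goes away, which may well be
the harder part of Jim's point.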

Paul