Subject Re: Architectural Cleanliness: CVT_move
Author paulruizendaal
--- In Firebird-Architect@yahoogroups.com, "paulruizendaal"
<pnr@w...> wrote:
> --- In Firebird-Architect@yahoogroups.com, Jim Starkey <jas@n...>
> > I'm sorry, I don't have any idea of what you're talking about.
> > Could you explain it greater length?
>
> Sure. Will probably be after Xmas, though.
>

I found a little time before Xmas. Herewith an elaboration on
the "virtual machine" idea. It is still somewhat sketchy, but I think
the main points are there. I'm not sure I've sequenced it right.
Perhaps best to first give it a quick read and then a thorough one.

Let me try to explain my train of thought by outlining some small
changes to the current VM and defining a few terms.

First, imagine that the impure area is not an extension of the
request record, but a separate memory block. I'll call this
the "impure block", although I think of it as a heap allocated stack
frame.

The remaining request structure has an added pointer to the above
separate impure block (probably some fields need to be moved from the
request structure to the impure block). It also has a state field,
which indicates the running state of the request. I'll call this
remaining structure the "request block". It may help to think of it
as a process descriptor, as one would find in an OS scheduling module.

Next there is the JRD_NOD tree. I'll call this the "code block" and
refer to individual nodes as "instructions". I think that byte codes
would be better than nodes, but that is perhaps a matter of taste:
conceptually, nodes are serializable too.

Now, think of looper/EVL as a virtual CPU, executing instructions in a
code block (text segment) referring to data stored in the impure block
(data/stack segment). I imagine that the code block has a header of
some kind describing the initial impure block.
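
A minimal sketch of the three blocks described so far, in Python purely
for brevity (the engine itself is C/C++). All names, the tuple-shaped
instructions and the "impure_size" header field are hypothetical; none
of them come from the actual source tree.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    RUNNABLE = auto()
    STALLED = auto()
    DONE = auto()


@dataclass
class CodeBlock:
    """Shared 'text segment': instructions plus a small header."""
    instructions: list   # hypothetical form, e.g. [("set", 0, 5), ...]
    impure_size: int     # header: number of impure slots a request needs


@dataclass
class RequestBlock:
    """Per-request 'process descriptor'."""
    code: CodeBlock
    impure: list                    # the separate impure block ("BP")
    pc: int = 0                     # saved instruction pointer ("PC")
    state: State = State.RUNNABLE


def new_request(code: CodeBlock) -> RequestBlock:
    # The code header describes the initial impure block.
    return RequestBlock(code=code, impure=[0] * code.impure_size)
```

Two requests built from the same code block share the instructions but
each gets its own impure block, which is the point of the separation.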

Last, imagine a simple scheduler that assigns this virtual CPU to
whatever request is runnable (i.e. not stalled). The CPU registers get
set (load current node "PC" and set the impure block pointer "BP").
That request runs until it reaches a stalling instruction; the
registers are saved in the request block and another request is
scheduled to run. A request can be 'unstalled' by sending it a
message, or by reading a message from it.
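
The run/stall/unstall cycle can be sketched as a toy interpreter. This
is an illustration only: the three opcodes ("set", "add", "send"), the
dict-shaped request and the function names are all made up, and only
the "read a message" way of unstalling is shown.

```python
from collections import deque


def make_request(code, impure_size):
    return {"code": code, "impure": [0] * impure_size, "pc": 0,
            "state": "runnable", "outbox": deque()}


def run_until_stall(req):
    """Virtual CPU: load PC/BP, execute until a stalling instruction."""
    pc, bp = req["pc"], req["impure"]        # load the registers
    code = req["code"]
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "set":
            bp[args[0]] = args[1]
        elif op == "add":
            bp[args[0]] += args[1]
        elif op == "send":                   # the stalling instruction
            req["outbox"].append(bp[args[0]])
            req["state"] = "stalled"
            break
    else:
        req["state"] = "done"                # ran off the end of the code
    req["pc"] = pc                           # save the registers


def schedule(run_queue):
    """Assign the virtual CPU to each runnable request in turn."""
    while run_queue:
        run_until_stall(run_queue.popleft())


def receive(req, run_queue):
    """Client reads a message; this unstalls the request."""
    value = req["outbox"].popleft()
    if req["state"] == "stalled":
        req["state"] = "runnable"
        run_queue.append(req)
    return value
```

Because the PC lives in the request block, two requests over the same
code can be at different points, and the single CPU simply switches
between them at each stall.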

This, in a simplified and warped description, is how the VM currently
works (ignoring threads and that scheduling is implicit). I hope I
haven't totally misunderstood the code.

Let's see where we can take this warped view of the VM.

[1] Simplify engine internal requests

Currently each internal request is wrapped in boilerplate: at the top,
code to find a free request, to clone one or - if here 1st time
'round - to compile the BLR to a request structure; at the bottom,
code to free the request.

With an explicit scheduler, the code to clone a request can be
centralised there: a request block is created and, using the code
header, the impure block is set up. The request is set "runnable". If
we don't store BLR but the code block directly (i.e. upgrade gpre),
starting an internal request would be as easy as passing the
scheduler a pointer to the code block (+ db + transaction) and
saying "run this". With BLR, the scheduler would have to check for an
earlier compile of that BLR, for instance based on the address of the
BLR string or some other guid.
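
A sketch of what "run this" might look like when the caller still
passes BLR. The Scheduler class, its start() method and the compile
callback are hypothetical. The cache is keyed on the BLR bytes
themselves, which is an assumption made for the sketch; the text above
suggests the address of the BLR string or some other id instead.

```python
class Scheduler:
    def __init__(self, compile_fn):
        self._compile = compile_fn   # hypothetical BLR -> code block compiler
        self._code_cache = {}        # BLR bytes -> compiled code block
        self.runnable = []

    def start(self, blr, db, tra):
        """Create a runnable request for this BLR, compiling at most once."""
        code = self._code_cache.get(blr)
        if code is None:
            code = self._code_cache[blr] = self._compile(blr)
        request = {
            "code": code,                           # shared between clones
            "impure": [0] * code["impure_size"],    # set up from the header
            "pc": 0,
            "state": "runnable",
            "db": db,
            "tra": tra,
        }
        self.runnable.append(request)
        return request
```

The find/clone/compile logic then lives in one place, and an internal
request at the call site shrinks to a single start() call.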

[2] More versatile scheduling model

Currently, the thread that "unstalls" a request by reading/writing a
message also runs the request till the next stall. We can separate
those threads: there could be a number of "virtual CPU" threads that
the scheduler could exclusively use. These threads would be allocated
to the requests that were in the runnable state.

A client thread that wanted to read a message and found the message
port empty and the request to be running or runnable would have to
wait for it to stall. The same goes for writing a message to a full
port.
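
A rough sketch of this split between client threads and "virtual CPU"
threads, using Python's standard queue and threading modules. The
request body is modelled as an iterator that produces one message per
stall, which is a simplification chosen for the sketch; the names and
the END marker are hypothetical, and only the reading side is shown.

```python
import queue
import threading

END = object()          # hypothetical end-of-stream marker


def virtual_cpu(run_queue):
    """Worker thread: run any runnable request until its next stall."""
    while True:
        req = run_queue.get()
        if req is None:                      # shutdown sentinel
            return
        message = next(req["body"], END)     # execute up to the next stall
        req["port"].put(message)             # request now waits on its port


def receive(req, run_queue):
    """Client thread: wait for a message, then make the request runnable."""
    message = req["port"].get()              # blocks while the port is empty
    if message is not END:
        run_queue.put(req)                   # unstall; a virtual CPU runs it
    return message


def fetch_all(body, n_cpus=2):
    """Start a request on a pool of virtual CPUs and drain its messages."""
    run_queue = queue.Queue()
    cpus = [threading.Thread(target=virtual_cpu, args=(run_queue,))
            for _ in range(n_cpus)]
    for cpu in cpus:
        cpu.start()
    req = {"body": body, "port": queue.Queue(maxsize=1)}
    run_queue.put(req)                       # the request starts out runnable
    results = []
    while (message := receive(req, run_queue)) is not END:
        results.append(message)
    for _ in cpus:
        run_queue.put(None)
    for cpu in cpus:
        cpu.join()
    return results
```

The client thread never executes the request itself: it only waits on
the port and flips the request back to runnable, while whichever CPU
thread is free does the work.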

Also, it could be helpful for computationally complex requests to
yield their thread from time to time, so that a higher priority
request can be scheduled, if there is one. This could even be
preemptive.
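
To illustrate the yielding idea, here is a small simulation of a
cooperative time slice with priorities. It is only a sketch under
assumed conventions: work is counted in abstract units, a lower number
means higher priority, and the function name and tuple layout are
invented for the example.

```python
import heapq
import itertools


def run_with_yield(arrivals, slice_len=2):
    """arrivals: list of (arrival_time, priority, name, work_units)."""
    pending = sorted(arrivals)
    seq = itertools.count()      # tie-breaker: FIFO within one priority
    heap, trace, clock = [], [], 0
    while pending or heap:
        # Admit every request that has arrived by now.
        while pending and pending[0][0] <= clock:
            _, prio, name, units = pending.pop(0)
            heapq.heappush(heap, (prio, next(seq), name, units))
        if not heap:             # idle: jump to the next arrival
            clock = pending[0][0]
            continue
        prio, _, name, units = heapq.heappop(heap)
        ran = min(slice_len, units)
        clock += ran
        trace.append((name, ran))
        if units > ran:          # yield point: go back into the run queue
            heapq.heappush(heap, (prio, next(seq), name, units - ran))
    return trace
```

With a long low priority request starting at time 0 and a short high
priority one arriving at time 1, the short one runs as soon as the
long one reaches its first yield point instead of waiting for it to
finish.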

Perhaps the scheduler is a good place to put load balancing code. To
me, there seems to be a lot of OS technology that we could usefully
apply.

[3] Outline of a pluggable procedural engine interface

The execution of a request does not necessarily stall only when a
message is exchanged with the client. It could stall equally well in
EVL. An external procedure could be called by sending a message to
the external engine and then stalling the request. The external
engine (running on another thread or as another process) would pick
up the message, do its thing and send a message back. This reply
would again unstall the request, and the next instructions in the
code block would fetch this result message.
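
The call/stall/reply cycle can be sketched with a Python generator
standing in for the request: each "yield" is the stalling instruction
that sends a message to the external engine, and the value it
evaluates to is the reply fetched after the unstall. Everything named
here (run, external_engine_step, ext_upper, the message shape) is
hypothetical, and the engine is stepped in the same thread to keep the
example short.

```python
from collections import deque


def example_body():
    # Stands for an expression like: ext_upper('abc') || '!'
    reply = yield ("ext_upper", "abc")   # send the call message, then stall
    return reply + "!"                   # next instruction fetches the reply


def run(request, to_engine):
    """Run a request until it stalls on an external call or finishes."""
    try:
        message = request["gen"].send(request.pop("reply", None))
    except StopIteration as stop:
        request["state"], request["result"] = "done", stop.value
        return
    request["state"] = "stalled"
    to_engine.append((request, message))


def external_engine_step(to_engine, procedures):
    """External engine: pick up one message, do its thing, reply."""
    request, (name, arg) = to_engine.popleft()
    request["reply"] = procedures[name](arg)
    request["state"] = "runnable"        # the reply unstalls the request
```

In a real setup the external engine would sit on another thread or in
another process, and the two queues would be whatever messaging layer
connects them; the state transitions stay the same.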

I think there are only two occasions where messages are sent to an
external engine, initiated by the relational engine: calling an
external function inside an expression being evaluated and calling an
external procedure as body of a trigger. Perhaps fetching the next
record from a procedural record stream is a third.

The messaging involved would be quite complex, but not unduly so for
the task at hand - running pluggable external engines. Perhaps we can
reuse some ideas from micro-kernel OS's to optimise the messaging in
this interface.

The external procedural engines could potentially call back into the
engine by requesting the scheduler to create and execute a new
request, passing back the db & tr handles and starting a message
exchange like clients do.


Hope this makes it a bit more clear.


Paul