firebird-architect - Re: [Firebird-Architect] Re: RFC: Clustering

Subject	Re: [Firebird-Architect] Re: RFC: Clustering
Author	Jim Starkey
Post date	2006-11-21T15:32:03Z

Lester Caine wrote:

> As you have said Jim, we need to define the problem. Simply saying it
> will not work is not helping lay out a plan as to what can and can't
> be made practical.
>

The issue is high availability. There are a variety of ways to make
clusters work. The original Interbase architecture was a case in point
-- it worked on VAXClusters from release 1.0. It did require a
distributed lock manager. If you remove the distributed lock manager
from the equation, that solution won't work.

The most promising technology for high availability is not clusters but
replication among independent servers. There are a number of models
that can be made to work, but one that works in practice is chained
replication, where each server replicates to the next guy down the
chain. I've got a customer running five-way replication (last guy on
the list is in a deep bunker in the New Jersey wastelands). The key
elements to the implementation were:

1. Multi-table triggers (machine generated per-table triggers could
   do the same things less elegantly) to collect change records.
2. An internal procedure to blast batched change records to replicants.
3. A sequence/generator mechanism that guarantees unique key
   generation across the system.
4. A server to server communication mechanism to receive updates.
5. An online database clone operation to bring new nodes into the chain.
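
To make item 3 concrete, here is a minimal sketch of one way to get
system-wide unique keys without any coordination between servers. It is
not necessarily what the customer system does: the interleaving scheme
(each node starts at its own index and steps by the node count) and the
names NodeSequence and demo_keys are assumptions made up for
illustration.

```python
class NodeSequence:
    """Hypothetical per-node sequence: node i hands out i, i + n, i + 2n, ...

    Assumption: the number of nodes n is fixed and known to every node.
    """

    def __init__(self, node_index: int, node_count: int) -> None:
        if not 0 <= node_index < node_count:
            raise ValueError("node_index must be in [0, node_count)")
        self._next = node_index
        self._stride = node_count

    def next_value(self) -> int:
        # Return the current key, then step by the node count so that
        # no other node can ever produce the same value.
        value = self._next
        self._next += self._stride
        return value


def demo_keys(node_count: int = 5, per_node: int = 4) -> list:
    """Generate a few keys on each node of a five-way system."""
    sequences = [NodeSequence(i, node_count) for i in range(node_count)]
    return [[seq.next_value() for _ in range(per_node)] for seq in sequences]


keys = demo_keys()
```

The point of the design is that each server can generate keys locally,
so a record created on any node keeps the same key everywhere it is
replicated. The cost is that the stride has to be chosen large enough
up front to cover every node that will ever join.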

A server entering the system would start with a database clone (age not
particularly important), determine from its MAC address who its
neighbors were, check in with those neighbors, and start receiving
stuff from the replication log until it was up to date.
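
The join procedure can be sketched roughly as below. This is a toy
in-memory model, not the real implementation: the Node class, its
methods, and the use of the log length as the catch-up position are all
assumptions for illustration, and the MAC address lookup of neighbors is
left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical chain member: holds data, a change log, and a downstream link."""
    name: str
    data: dict = field(default_factory=dict)   # key -> value
    log: list = field(default_factory=list)    # ordered (key, value) change records
    downstream: Optional["Node"] = None

    def apply(self, batch: list) -> None:
        # Apply a batch of change records, log them, pass them down the chain.
        for key, value in batch:
            self.data[key] = value
            self.log.append((key, value))
        if self.downstream is not None:
            self.downstream.apply(batch)

    def clone(self, name: str) -> "Node":
        # Stand-in for the online clone: copy the data and the log seen so far.
        return Node(name, dict(self.data), list(self.log))

    def attach(self, newcomer: "Node") -> None:
        # Replay whatever the newcomer's clone missed, then link it in.
        missed = self.log[len(newcomer.log):]
        newcomer.apply(missed)
        self.downstream = newcomer


def demo_chain():
    head, tail = Node("head"), Node("tail")
    head.downstream = tail
    head.apply([(1, "a"), (2, "b")])
    bunker = tail.clone("bunker")        # clone taken now, goes stale below
    head.apply([(2, "c"), (3, "d")])     # changes made after the clone
    tail.attach(bunker)                  # newcomer catches up from the log
    head.apply([(4, "e")])               # normal chained replication resumes
    return head, tail, bunker


head, tail, bunker = demo_chain()
```

Because the newcomer catches up from the log, the clone can be as old
as the retained log allows, which is why its age is not particularly
important.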

Implementing this in Firebird would take a bit of work. The guts of the
scheme assume that there is a procedure/trigger language
computationally equivalent to Java (the server to server communication
mechanism is based on the exchange of serialized Java object clusters).
The clone operation is already in place in the shadow bootstrap mechanism.

The multicast to parallel server idea is really a non-starter.
Firebird, running two conflicting update transactions, is not itself
deterministic.
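
A toy illustration of the problem, under a deliberately simplified
conflict model (an assumption, not a description of the engine's
internals): the first transaction to update a row wins and the
concurrent writer is rolled back. The run_server function and the
transaction names are hypothetical.

```python
def run_server(schedule):
    """Apply two conflicting updates in the order this server scheduled them.

    Simplified model (assumption): the first writer to the row wins and
    any later concurrent writer to the same row is rolled back.
    """
    row = {"value": "initial", "writer": None}
    outcome = {}
    for txn, new_value in schedule:
        if row["writer"] is None:
            row["value"] = new_value
            row["writer"] = txn
            outcome[txn] = "committed"
        else:
            outcome[txn] = "rolled back"
    return row["value"], outcome


# Both servers receive the same multicast requests...
updates = [("T1", "red"), ("T2", "blue")]

# ...but thread scheduling lets them reach the row in a different order.
server_a = run_server(updates)
server_b = run_server(list(reversed(updates)))
```

Same inputs, different winners: server_a ends with the row set by T1
and server_b with the row set by T2, so the two copies have diverged
with nothing to reconcile them.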