Subject Re: [Firebird-Architect] Re: Special Relativity and the Problem of Database Scalability
Author Paul Ruizendaal
> What Jim means is that if you record the state of the database
> with all nodes stopped and all uncommitted records discarded,
> then start all nodes, run some number of transactions, then
> stop all processing on all nodes and discard all uncommitted
> results, the state of the database on all nodes will be identical
> and it will be equivalent to the result of having run all those
> transactions serially in some order.

That is indeed how it should work, but note that there will be many, many
serializations that will take the database from that initial state to that
final state. On a distributed system, each node may take a different
serialization and clients could see results not possible on a single
system. E.g. we have two non-conflicting updates (A and B) and three reads.
Node 1 applies A first and node 2 applies B first. A client could see
update A and not B on its first read (having been routed to node 1), then
see update B and not A (having been routed to node 2) and then see both A
and B later (on either node). This could not happen on a single system, but
is perfectly legal in my view (and in Jim's view as I understand it).
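
To make the scenario concrete, here is a minimal Python sketch. The
function names and the boolean "update applied" model are assumptions made
for illustration only, not anything from an actual implementation. It shows
that both nodes end in the same state, while the sequence of three reads
matches no single serial order.

```python
from itertools import permutations

def apply_updates(order, steps):
    """State of a replica after the first `steps` updates of `order`."""
    state = {"A": False, "B": False}
    for name in order[:steps]:
        state[name] = True
    return state

def possible_on_single_node(reads, updates=("A", "B")):
    """True if some single serial order could produce `reads` in sequence."""
    for order in permutations(updates):
        # Every state a single node passes through under this order.
        prefixes = [apply_updates(order, k) for k in range(len(order) + 1)]
        pos = 0
        ok = True
        for r in reads:
            # Time only moves forward: search from the current position on.
            while pos < len(prefixes) and prefixes[pos] != r:
                pos += 1
            if pos == len(prefixes):
                ok = False
                break
        if ok:
            return True
    return False

node1_order = ["A", "B"]   # node 1 applies A first
node2_order = ["B", "A"]   # node 2 applies B first

read1 = apply_updates(node1_order, 1)  # routed to node 1: sees A, not B
read2 = apply_updates(node2_order, 1)  # routed to node 2: sees B, not A
read3 = apply_updates(node1_order, 2)  # later, either node: sees both

# Both nodes converge on the same final state.
final_states_match = apply_updates(node1_order, 2) == apply_updates(node2_order, 2)
single_node_ok = possible_on_single_node([read1, read2, read3])

print(final_states_match)  # True
print(single_node_ok)      # False
```

The final states agree, which is what the quoted definition asks for, yet
the read sequence is one that a single system could never return.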

In my opinion the definition you give above is satisfied if conflicting
updates have a consistent order, and all other access has a FIFO order
(both terms as used with GCSs). Agree?
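
One possible reading of those two conditions, sketched in Python. The
(sender, seq, key) tuple format, the "same key means conflict" rule, and
both function names are assumptions for illustration; a real GCS defines
its ordering guarantees in its own terms. The sketch also assumes every
node has delivered the same set of updates.

```python
def fifo_ok(log):
    """FIFO order: each sender's updates arrive in the order they were sent."""
    last = {}
    for sender, seq, _key in log:
        if seq <= last.get(sender, -1):
            return False
        last[sender] = seq
    return True

def conflicts_consistent(logs):
    """Consistent order: updates to the same key are in the same relative
    order in every node's log (updates to different keys may interleave)."""
    reference = None
    for log in logs:
        per_key = {}
        for sender, seq, key in log:
            per_key.setdefault(key, []).append((sender, seq))
        if reference is None:
            reference = per_key
        elif per_key != reference:
            return False
    return True

# Hypothetical updates: (sender, sequence number, key touched).
u1 = ("c1", 0, "x")
u2 = ("c2", 0, "y")
u3 = ("c1", 1, "y")   # conflicts with u2: same key

node1 = [u1, u2, u3]
node2 = [u2, u1, u3]  # u1 and u2 swapped: allowed, they do not conflict
node3 = [u1, u3, u2]  # u2 and u3 swapped: not allowed, they conflict
node4 = [u3, u1, u2]  # c1's updates out of sending order: breaks FIFO

print(fifo_ok(node1) and fifo_ok(node2))     # True
print(conflicts_consistent([node1, node2]))  # True
print(conflicts_consistent([node1, node3]))  # False
print(fifo_ok(node4))                        # False
```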

> As for the single lost node problem, not so important since
> everything is in at least two places and there is failover
> for all chairman activities. A node that has lost communication
> comes back like any new node. Partition is a harder problem.

Yeah, the "split brain" problem is not so easy. Perhaps the solution is
for the part that continues with fewer than 51% of the nodes to disqualify
itself from further processing, but that approach could trigger progressive
failure.
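
A minimal Python sketch of such a rule, assuming a strict majority of the
original cluster size is required (that threshold, and the has_quorum name,
are assumptions here; "51%" could be read slightly differently for large
clusters). It also shows one way the failure can become progressive: once
the surviving side loses one more node, nobody has a majority.

```python
def has_quorum(reachable: int, total: int) -> bool:
    """A partition may keep processing only with a strict majority."""
    return 2 * reachable > total

TOTAL = 5  # hypothetical cluster size

# A 3/2 split: the larger side continues, the smaller one disqualifies itself.
big_side_continues = has_quorum(3, TOTAL)
small_side_continues = has_quorum(2, TOTAL)

# The surviving side then loses one more node: 2 of 5 is no longer a
# majority, so the whole cluster stops even though 4 nodes are still up.
after_second_failure = has_quorum(2, TOTAL)

# An even split of a 10-node cluster leaves no side able to continue.
even_split = has_quorum(5, 10)

print(big_side_continues, small_side_continues)  # True False
print(after_second_failure)                      # False
print(even_split)                                # False
```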

Paul