Subject Re: [Firebird-Architect] Re: Coherence, ACID, and Clusters, et al
Author Jim Starkey
paulruizendaal wrote:
> "Platitude. Deduct two points. You might also say 'replicate too
> little and scalability collapses'".
>
> Poor debating style. And nope, the reverse isn't necessarily true.
>
> "The art is breaking a very large database into manageable size
> objects. There's more danger in making them too big than too small.
> Another part of the art is keeping request/response round trips to an
> absolute minimum. In fact, I'd say there's a lot of art to be done."
>
> Yup, and I thought we were discussing that art. Anyway, any design
> that requires more than minimum replication between nodes will smudge
> partitions over way too many nodes. Well, that's my view of the art.
>
> "I think more like processor affinity in an SMP scheduler. Send the
> process to where the data is likely to be in cache. If it isn't
> here, take the hit and get it."
>
> Easily said, but deciding where the data is most likely to be is not
> trivial -- especially when designing in a RESTful way.
>
> I hope you have thought it all through to a much more fundamental
> level than you give the impression in this thread.
>

Yes, I have thought it through, and no, I'm not prepared to go public
with the details at this stage.

Suffice it to say that the SQL engine is layered on an object
substrate. Individual objects persist on disks attached to archive
nodes. When an object is required on a SQL node, it gets it either from
another SQL node or an archive node. Active objects are synchronized
with asynchronous, batched replication. Most objects are stable and
don't require synchronization. Active objects, given a good design,
reside on relatively few nodes. If a node finds it has a busy object that
it doesn't really need, it drops it.
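
To make the lookup order above concrete, here is a minimal Python sketch. It is
not the actual design, whose details are not public; the class names, the
in-memory dictionaries standing in for disks and caches, and the peer list are
all assumptions made for illustration. Replication of active objects is left
out entirely.

```python
class ArchiveNode:
    """Hypothetical archive node: holds a full copy of every object, no SQL engine."""

    def __init__(self, objects):
        # A dict stands in for objects persisted on attached disks.
        self.objects = dict(objects)

    def fetch(self, object_id):
        return self.objects[object_id]


class SqlNode:
    """Hypothetical SQL node: caches only the objects it is currently using."""

    def __init__(self, name, archive):
        self.name = name
        self.archive = archive
        self.peers = []   # other SQL nodes, filled in after construction
        self.cache = {}   # object_id -> object

    def get(self, object_id):
        """Return (object, source), where source says where the copy came from."""
        # 1. Already resident here: no round trip at all.
        if object_id in self.cache:
            return self.cache[object_id], "local"
        # 2. Another SQL node already holds it: take a copy from that peer.
        for peer in self.peers:
            if object_id in peer.cache:
                self.cache[object_id] = peer.cache[object_id]
                return self.cache[object_id], "peer:" + peer.name
        # 3. Nobody has it in memory: fall back to an archive node.
        self.cache[object_id] = self.archive.fetch(object_id)
        return self.cache[object_id], "archive"

    def drop(self, object_id):
        """Give up a busy object this node does not really need."""
        self.cache.pop(object_id, None)
```

With two SQL nodes `a` and `b` sharing one archive, the first `a.get("t1")`
comes from the archive, a following `b.get("t1")` comes from `a`, and after
`a.drop("t1")` node `a` would have to fetch it from `b` again.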

There are no pre-defined partitions. Data goes where it needs to go.
Partitioning, in my humble view, is a crude and awkward workaround to
the limitations of single-computer, disk-based database systems. And, I
might add, having worked on a system that supported it, it left a bad
taste in my mouth.

But the point really is that data doesn't exist in any predefined
locations except the archive nodes, and each archive node has a full copy
of everything but no SQL engine. So sending a query to the data is a
hopeless exercise. Sending like queries to the same place in
recognition that sooner or later the data *will* be there, however, is a
very good strategy. But initially, at least, the query gets there first.
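
One way to picture "sending like queries to the same place" is to route on the
shape of the query rather than on where the data currently lives. The Python
sketch below is only an illustration under stated assumptions: the
normalization rule (strip literals, ignore case and whitespace), the use of a
hash to pick a node, and the function names `query_shape` and `route` are all
hypothetical, not a description of the real routing logic.

```python
import hashlib
import re


def query_shape(sql):
    """Crude, assumed notion of 'like queries': same text once literals are removed."""
    s = sql.lower()
    s = re.sub(r"'[^']*'", "?", s)   # quoted string literals -> ?
    s = re.sub(r"\b\d+\b", "?", s)   # standalone integer literals -> ?
    return " ".join(s.split())       # collapse runs of whitespace


def route(sql, nodes):
    """Pick a SQL node from the query shape alone, so like queries land together."""
    digest = hashlib.sha256(query_shape(sql).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

Under this scheme `SELECT * FROM orders WHERE id = 42` and
`select * from orders where id = 7` share a shape and so go to the same node.
The first one arrives before the data does and pays for the fetch; later ones
find the objects already resident.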