| Subject | Re: [Firebird-Architect] Interesting paper | 
|---|---|
| Author | Paulo Gaspar | 
| Post date | 2010-09-08T17:35:36Z | 
Hi Paul,
There are a lot of so-called NoSQL databases using Vector Clocks, which are derived from Lamport's work on logical clocks:
http://en.wikipedia.org/wiki/Vector_clock
(There is another field that should really be aware of Vector Clocks: the SOA / EAI / B2B thing... but I digress.)
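For anyone who has not met them before, here is a minimal sketch of the idea in Python. It is my own illustration, not code from Dynamo or any particular database: the function names (`increment`, `merge`, `compare`) and the node ids "A", "B", "C" are made up, and I assume a clock is just a dict mapping a node id to a counter.

```python
from typing import Dict

# Assumption: a vector clock is a plain dict of node id -> event counter.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'equal', 'before', 'after' or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: the versions conflict


# Hypothetical history: node A writes, then B and C each update A's version.
v1 = increment({}, "A")
v2 = increment(v1, "B")
v3 = increment(v1, "C")
```

With this history, `compare(v1, v2)` reports "before", while `compare(v2, v3)` reports "concurrent": neither write saw the other, so something has to reconcile them. A reconciled version would carry `merge(v2, v3)` advanced by the reconciling node, which then comes after both.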
The best starting point to understand the principles those databases are based on and the kind of techniques they use is the original Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
You will find a lot of references and summaries, but nothing beats the real thing. Reading about the Dynamo paper without reading the paper itself is a waste of time, all the more so because of the hands-on statistics it presents from Amazon's production environment.
It was because this paper was so awesome that we now have to put up with the NoSQL hysteria...
=;o)
But many of those "NoSQL" databases sure are aware that Query Languages are a good thing:
http://hadoop.apache.org/pig/
If you find this issue interesting, take a look at my presentation at:
http://www.slideshare.net/paulogaspar7/distributed-programming-and-data-consistency-w-notes
The real juice is not in the slides but in the annotations: I collected a nice set of references (blog posts, articles and academic papers) about distributed data processing techniques, data "consistency" (whatever that means), etc.
Have fun,
Paulo Gaspar
On 2010-09-08, at 10:37, Paul Ruizendaal wrote:
> Paolo,
>
> Thanks for your tip. From that post I followed some links and stumbled
> across something I had not noticed before (perhaps stupidly so): Yahoo's
> Sherpa/PNUTS system. It seems to use Lamport relativity as one of its
> concepts, but also uses sharding as a core concept with no joins across
> shards. I guess db engine architecting hasn't been this much fun since the
> early 80's.
>
> Paul
>
> PS I agree with Ann that there is much confusion about terms in the field
> currently. Not only is there the serialisable/isolation misunderstanding,
> also "consistent" is often used to mean "atomic". The PNUTS paper is but
> one example of this.
>
> On Tue, 7 Sep 2010 18:19:04 +0100, Paulo Gaspar <paulo.gaspar@...>
> wrote:
> > There is already a brilliant reply / rebuttal:
> >
> > http://yz.mit.edu/wp/infrequently-asked-questions-on-deterministic-distributed-transaction-management/
> >
> > Have fun,
> > Paulo Gaspar
>