Subject | scalable joins |
---|---|
Author | paulruizendaal |
Post date | 2009-02-22T10:29:47Z |
Hi all,
highscalability has an interesting write-up about Google App Engine
and its datastore. One blogger is quite disappointed with the
datastore's limitations:
http://catherinedevlin.blogspot.com/2008/09/bigtable-blues.html
http://catherinedevlin.blogspot.com/search/label/geekeventaggregator
Google has this post:
http://googleappengine.blogspot.com/2009/02/back-to-future-for-data-storage.html#
It makes a bold statement:
"This isn't an accident -- when you build a system that can scale to
the size that Bigtable can there's no way to do a general purpose
join on data sets that size and still have them be performant."
Now my question is whether this is true: is it mathematically
impossible to do distributed joins in a generalized way?
Any ideas anyone?
Paul