Subject Re: hadoopdb parallel database
Author paulruizendaal
Marius,

Perhaps it is more interesting to have a look at ScimoreDB's approach to scaling out:
http://www.scimore.com/products/distributed.aspx

The way I see it, it takes a classical RDBMS and automates most of the work of running a sharding-type cluster:
http://www.scimore.com/products/distributed-Technology.aspx
and it claims to scale pretty well:
http://www.scimore.com/technology/tpc.aspx
(as long as the application doesn't have hot spots, the claimed scaling sounds credible). The cluster is organised as a symmetric binary tree, but the tree is arranged differently over the available machines for each connection:
http://www.scimore.com/doc2/Network_Communications.html

Now, there is plenty wrong with this design (HA is hard, live reconfiguration is hard, rebalancing the data when a new sharding key is needed is hard, to name a few), but it can serve a whole range of scenarios.
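
To make the re-sharding point concrete, here is a minimal sketch. It assumes rows are placed by a stable hash of the sharding key modulo the node count; that placement rule is an assumption for illustration, not something taken from the Scimore documentation. The table, the column names and the node count are made up. It counts how many rows would land on a different node if the sharding key changed from customer_id to order_id:

```python
import hashlib

NODES = 4  # hypothetical cluster size


def node_for(key, nodes=NODES):
    """Map a sharding-key value to a node number with a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


def fraction_moved(rows, old_key, new_key, nodes=NODES):
    """Fraction of rows whose node changes when the sharding key changes."""
    moved = sum(
        1 for row in rows
        if node_for(row[old_key], nodes) != node_for(row[new_key], nodes)
    )
    return moved / len(rows)


# Made-up data: 1000 orders spread over 50 customers.
rows = [{"order_id": 1000 + i, "customer_id": i % 50} for i in range(1000)]
moved = fraction_moved(rows, "customer_id", "order_id")
print(f"{moved:.0%} of rows change node")
```

Under these assumptions, with N nodes roughly (N-1)/N of the rows end up on a different node, so changing the sharding key amounts to a bulk data migration rather than a metadata change.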

I'm not sure the BLR API still exists in FB head, but if it does, the Scimore design could be implemented by taking the SQL compiler & optimiser, breaking it out into a separate binary, and enhancing it to do the automated sharding, replication, etc. and to send the appropriate BLR to each node in the tree.
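
As a rough illustration of that split between a front end and the storage nodes, here is a toy coordinator in Python. Everything in it is hypothetical: the Node and Coordinator classes, the flat list of nodes (rather than a tree), and the dict standing in for each node's storage. A real version would compile SQL and ship BLR to Firebird nodes; this sketch only shows the routing decision, which is one node when the sharding key is known and all nodes otherwise.

```python
import hashlib


class Node:
    """Stand-in for one engine in the cluster; a dict replaces real storage."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # sharding-key value -> row

    def put(self, key, row):
        self.rows[key] = row

    def get(self, key):
        return self.rows.get(key)

    def scan(self, predicate):
        return [row for row in self.rows.values() if predicate(row)]


class Coordinator:
    """Hypothetical front end: picks the node(s) a request must go to."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        # Stable hash of the sharding key, reduced modulo the node count.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def insert(self, key, row):
        self._node_for(key).put(key, row)

    def lookup(self, key):
        # Sharding key known: exactly one node is contacted.
        return self._node_for(key).get(key)

    def select(self, predicate):
        # No sharding key available: scatter to all nodes, gather the results.
        results = []
        for node in self.nodes:
            results.extend(node.scan(predicate))
        return results


cluster = Coordinator([Node(f"node{i}") for i in range(4)])
for cid in range(100):
    cluster.insert(cid, {"customer_id": cid, "balance": cid * 10})

print(cluster.lookup(42))
big = cluster.select(lambda r: r["balance"] >= 900)
print(sorted(r["customer_id"] for r in big))
```

Keyed lookups touch a single node, which is where the scaling comes from; the scatter-gather path touches every node, so a workload dominated by such queries would not scale the same way.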

The other weak point in the Scimore approach is that each node is a classical engine, with fat connections, no distinction between short and long transactions, disk-based durability, etc. Moving to a more specialised engine at each node could make the design 10x faster than the numbers reported by Scimore.

Paul


--- In Firebird-Architect@yahoogroups.com, marius adrian popa <mapopa@...> wrote:
>
> I will try to see if it can be used with firebird nodes
>
> http://radar.oreilly.com/2009/07/hadoopdb-an-open-source-parallel-database.html
>