firebird-support - Re: [firebird-support] Firebird and sharding ?

Subject	Re: [firebird-support] Firebird and sharding ?
Author	unordained
Post date	2012-03-28T21:35:33Z

---------- Original Message -----------
From: "nathanelrick" <nathanelrick@...>

> Is it possible via the api to shard a database ? I mean to split the
> database in several database (this we do manually), and call the SQL
> in parallele on each database and aggregate the data at the end (this
> throught the api) ?
>
> i can do it from the application but it's just few line of code and
> it's can add great feature to firebird !

------- End of Original Message -------

execute block (query varchar(32000)) returns (columnlist...) as
declare shardurl varchar(500);
begin
for select dbname from shards into :shardurl do
begin
for execute statement :query on external :shardurl with common transaction
into :columnlist... do suspend;
end
end

There. Now you've got a union-all, from all the shards, unsorted (each chunk
might be sorted) and because execute-block can't be used directly in a select
query, you'll need some other solution to re-sort the result (either a
permanent valve procedure, or temp-tables, etc.)

I don't think this is a good solution as-is. It's not automatic, the queries
are not truly run in parallel, the execution plan can't be optimized for down-
stream needs (like sorting or filtering) ... but it achieves one thing: it all
runs inside the db. Yay. That does mean you can use it as a common back-end for
lots of stored procedures, reports, etc. while a hand-built, end-user app
(unless it re-exposes itself as a generic datasource) would be harder to use
"in the general case".

I *could* see a situation where this is useful if you have an 'index' database
that redirects you to only one shard, dynamically looking it up.

As to updates: if doing the above (identifying & using 1 particular shard at a
time through a central query node, or, I suppose, a set of "contact nodes"
surrounding the core storage nodes), it's possible to create views on top of
stored procedures [that source the data], with insert/update/delete triggers to
handle the changes [again through external execute-statements], though the
performance on that would ... well, it'd be nothing worth writing home about.

-Philip