Subject Re: [Firebird-Architect] RFC: Clustering
Author Eduardo Jedliczka
comments in text...

----- Original Message -----
From: "= m. Th =" <th@...>
To: <Firebird-Architect@yahoogroups.com>
Sent: Saturday, November 18, 2006 6:43 AM
Subject: Re: [Firebird-Architect] RFC: Clustering


> -------------
> Responding to Eduardo:
>
Once more: thanks.

>> First of all, I think the cluster nodes idea is fabulous, and I like a lot
>> of your comments; you have clearly spent a lot of time thinking about it.
>> But I believe shadows are not the best way. An internal solution like the
>> flag used in NBackup (with a PageVersion for each cluster node) would be
>> much better...
>>
>> Of course we need the primary database server. But if we turn off a
>> secondary cluster node for a short time, I think the recovery time is
>> very long with shadows (again, it's only my POV), and copying only the
>> changed pages is a faster way to do the same thing.
>
> First of all, thank you very much for your input. Indeed, it is possible,
> in the situation you describe above, that copying only the changed
> pages would be faster. But there are some things to observe here:
>
> 1. The possibility that '*we* turn off a secondary cluster node' is,
> IMHO, let's say, very rare. AFAIK, "very few" administrators turn off
> their servers. The most probable case, and the one for which clusters
> are designed, is when a server crashes due to, let's say, an
> 'unplanned' event (hardware failure, power loss, network cable pulled,
> viruses etc.). In this case, the affected database file is probably
> corrupted. The engine may see this corruption immediately, or it may
> not. Either way, IMHO, the database file isn't reliable anymore,
> and a page header scan won't tell us which pages are 'ill' (it will
> tell us only the 'changed' ones), and this is in the 'happy' case in
> which the db header structure remains intact. So, in my very humble
> opinion, doing a page synchronization (I think we can call the thing
> you want to achieve that) can be a very fast thing but a
> very fragile one. Also, the delta file which is created during the
> page sync process must be applied to both files, which requires a
> separate engine.
>
I understood, and you're right! But I see a lot of network problems, O.S.
crashes, and electrical problems making one or another computer restart
(really... it's true!!!). And it's more common than you think!

And most of the time, these little crashes don't destroy/corrupt the
database file (except with forced writes off)... but yes, really, it's very
fragile.
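To show what I mean by "fragile": after a dirty restart, the only cheap test
we have is the page version in the header, and that tells us "changed", not
"healthy". A tiny sketch (Python, and the page layout here is invented by me,
it is NOT the real ODS; it only shows the difference):

    import struct, zlib

    # Invented page header: (page_type, page_version, body_crc).
    # The version is what a header scan reads; the CRC is what it
    # would NEED to tell an 'ill' page apart.
    HDR = struct.Struct("<IQI")

    def classify(page, last_synced_version):
        ptype, version, stored_crc = HDR.unpack_from(page)
        changed = version > last_synced_version   # all the scan can see
        healthy = zlib.crc32(page[HDR.size:]) == stored_crc
        return changed, healthy

A torn write can leave 'version' looking perfectly normal while the body is
garbage, and the header scan would never notice.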

> 2. The remote shadow engine is ultra-fast, at least on Windows where I
> have experience. At 1Gb the network is transparent; at 100Mb it has a
> very small lag (i.e. 1% in our tests, which means that a test which
> takes 100 secs without a shadow takes 101 with one). This is
> achieved by synchronously writing only the changed pages to both files
> (main and shadow). The node *creation* is done asynchronously (whenever
> we want - see the 'Pending' point above), so the creation time doesn't
> matter; only the node *activation* must be done with no one
> attached to the clustered db, but the activation is blazingly fast: some
> bytes changed in the db header and a field of a record in RDB$Files.

Again you're right, but what if the cluster runs on Linux, Mac OS, or *BSD?
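(If I understood the shadow write path, in spirit it is something like the
sketch below - Python with invented names, only to fix the idea; the real
code is C++, and the "force to disk" part is exactly what differs between
the operating systems, hence my question:)

    import os

    # Invented sketch of the synchronous dual write, not real engine code.
    class ShadowedFile:
        def __init__(self, main_path, shadow_path, page_size=4096):
            self.page_size = page_size
            self.main = open(main_path, "r+b")
            self.shadow = open(shadow_path, "r+b")

        def write_page(self, page_no, data):
            # Only the *changed* page travels to the shadow, and the call
            # does not return before both copies are flushed - which is
            # why the lag stays around 1% on a 100Mb network.
            assert len(data) == self.page_size
            for f in (self.main, self.shadow):
                f.seek(page_no * self.page_size)
                f.write(data)
                f.flush()
                os.fsync(f.fileno())   # 'forced writes' on each file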

But some people need replication over the web, or a remote cluster (like
Oracle has). Here, bandwidth is a great problem.

Think about it: your cluster solution is the "best replication solution"
ever made for Firebird. When it's implemented, I will certainly use it for
remote replication.
>
> 3. Why do you want to 'turn off the secondary cluster node for a short
> time'? There are many more things to 'replicate' between nodes, not only
> the database file. IMHO, the file is the simplest part. There are open
> contexts, TIDs, generators, metadata changes etc. That's why I chose to
> do the 'replication' directly from the client. IOW, the 'replication
> engine' is FbClient.dll, which sends the commands to all nodes, so we
> don't get mixed up with DLMs, multi-host syncs, comm layers and so on.
> So, turning a cluster node off, even for a 'short time', renders it
> unusable due to the continuous flux of state changes (mainly contexts
> and TIDs) which must be kept in sync between the nodes. In my small
> implementation, a node can enter the cluster as active only when all the
> clients are out (see above about 'Pending'). IMHO, I don't see (for now)
> a solution for replicating live contexts across the servers.
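(So, if I follow, the 'replication engine' in FbClient.dll is, in spirit, a
fan-out like this - my sketch in Python, with invented names, nothing taken
from the real DLL:)

    class ClusterConnection:
        def __init__(self, node_connections):
            self.nodes = node_connections    # one real attachment per node

        def execute(self, sql, params=()):
            # The same command goes to every node, so contexts, TIDs and
            # generators advance in step everywhere. A node that misses
            # even ONE call is out of sync for good - hence 'Pending',
            # and hence no "turning a node off for a short time".
            results = [node.execute(sql, params) for node in self.nodes]
            return results[0]                # answer from the primary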

Here we have a lot of needs, different problems to solve...

My need is this: a single database (on one server or a cluster) serving
dozens or hundreds of stations. Of course we have some "report stations"
(clients with large reports to print, and sometimes with massive SQL use).
And unluckily Firebird doesn't have a good structure to extract the power of
the new dual-core (and new quad-core) processors. If one station starts a
"big report", a lot of clients are left waiting.
If we had a "true cluster" or a replicated "only-for-reports" database, this
need would be solved, or at least minimized.

This is my POV: the sync can be done by sending the DDL/DML commands between
the cluster nodes, but again a sequential cache is needed... (like the redo
log in Oracle)... Nodes in a cluster may have different performance... some
nodes may be faster than others, and may have better network hardware (or
fewer collision problems).

An auxiliary inter-node connection is needed, sharing the transaction
counter, the clients' IPs and/or the DDL/DML still to apply (if a node goes
down, or if a new node is inserted), plus a small piece of logic to choose
the best "sync" method (sketched below): "shadow" - full database copy;
"page sync" - copying only a small portion of the pages; "transaction
reprocess" - using the redo log to bring an old node up to the current
database version.
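Something like this little decision function (Python only to express the
idea; the fields and thresholds are mine, a real engine would measure):

    from dataclasses import dataclass, field

    @dataclass
    class NodeState:
        transaction_counter: int
        suspect_corrupt: bool = False
        changed_pages_known: bool = False
        redo_log: list = field(default_factory=list)   # pending DDL/DML

    def pick_sync_method(stale: NodeState, primary: NodeState) -> str:
        lag = primary.transaction_counter - stale.transaction_counter
        if stale.suspect_corrupt:
            return "shadow"       # full database copy: the only safe option
        if lag <= len(primary.redo_log):
            return "reprocess"    # replay the missed DDL/DML from the log
        if stale.changed_pages_known:
            return "page sync"    # ship only the changed pages
        return "shadow"

    # e.g. a node 5 transactions behind, still covered by the redo log:
    # pick_sync_method(NodeState(100), NodeState(105, redo_log=["..."] * 5))
    # -> "reprocess"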

My first idea for doing this was "intercepting ALL communications between
the real Firebird server and the Firebird clients" (like ibmonitor does) and
doing the hard job (replication and routing connections to nodes) myself
(see the skeleton below). WHY? Easy: I can't change the Firebird source
code... I'm not a good C developer.
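The skeleton would be a simple relay like this (Python sketch; the address
is invented, and a real interceptor must of course parse the Firebird wire
protocol to know which packets carry DDL/DML worth replicating):

    import socket, threading

    FB_SERVER = ("192.168.0.10", 3050)   # the real server (invented address)
    LISTEN_ON = ("0.0.0.0", 3050)        # where the clients connect instead

    def pump(src, dst):
        while True:
            data = src.recv(8192)
            if not data:
                break
            # ... inspect 'data' here; replicate DDL/DML to other nodes ...
            dst.sendall(data)

    def serve():
        lsock = socket.socket()
        lsock.bind(LISTEN_ON)
        lsock.listen(5)
        while True:
            client, _ = lsock.accept()
            server = socket.create_connection(FB_SERVER)
            threading.Thread(target=pump, args=(client, server),
                             daemon=True).start()
            threading.Thread(target=pump, args=(server, client),
                             daemon=True).start()

    if __name__ == "__main__":
        serve()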

But if the official Firebird developers can implement this, I think a layer
in the real (primary) Firebird server, distributing the client connections
to the cluster nodes in the best way, would be the right approach. Of
course, sharing client IPs, contexts, TIDs, generators, metadata, etc.

>
>
>> My first problem was UDFs (those accessing external files, O.S.
>> drivers/resources, or the web), and some "update stored procedures"...
>> but IF cluster support is made inside the database... these operations
>> can be "replicated" correctly on the cluster nodes.
>
> Anything which is 'inside' the database gets 'mirrored' on the nodes. UDF
> DECLARE statements are; the .DLL/.SO files aren't. So, you must
> 'replicate' them 'by hand' ;)

Yes! You're right! Again!

>
>> The other problem is USER control (grants, etc.)... but that's another
>> story...
> This is solved (at least for Windows). The user that is used to start
> the process must have read-write rights on the share(s) on the other
> nodes.
>
> (A little helping hint - with love: Eduardo, perhaps it will help if, at
> the end of your messages, you press F7 (I saw that you use Outlook 6) or,
> if you want, change your e-mail client to Thunderbird. It has
> spell-checking as you type. It is one of the ways I learn English. I'm
> also from a Latin country, like you.)
>

OK... (but F7 does not work in my Brazilian Portuguese Outlook Express)

>
> hth,
>
> m. th.
>

I know: the problem is very large and hard to solve... sorry if I only see
my own problems...

But (maybe I'm in the wrong place) I need to point out some Firebird
deficiencies, and to propose a little "theoretical help", if welcome (in
another thread).

======================
Eduardo Jedliczka
Apucarana - PR - Brazil
======================