firebird-support - Re: [firebird-support] Down time

Subject	Re: [firebird-support] Down time
Author	Artur Anjos
Post date	2004-08-02T21:26:35Z

Ann W. Harrison wrote:

> The reason for my question is MySQL's "cluster" support.
> My reading of it is that their "clusters" have nothing to
> do with what used to be called clusters - groups of
> machines that share disk but not memory. The MySQL site
> claims that their clusters give "high availability" aka
> "5 nines". Looking more closely, "5 nines" means 99.999%
> up time. Sounds good. 0.001% of a week is 10 minutes.

Hi Ann,

This is not what I understand by cluster support.
"Server clusters provide high availability, failback, scalability, and
manageability for resources and applications".

I.e., if one server breaks down, another takes its place without any
user (or client application) noticing. The main concern is not software
but hardware: anything from a hard disk failure (the most common
problem these days) to a NIC failure (not common at all).

Many enterprise customers require cluster solutions. (If I have some
kind of accident, I would love to know that my hospital's server is
running in a cluster.)

I know that hardware failures are not very common these days, but
Murphy is always watching. If some database system has a problem in a
cluster environment, the problem should be replicated to the other
machines as well, if the cluster is working OK. :-)

For small applications we do have some other solutions that can be
simpler and limit the damage. One of my clients, for example, runs a
500 MB Fb database in an environment that has an average of 30 to 35
users connected. The server is a small IBM x-Series, so it was easy to
tell the client to buy another of these boxes (about 1000 euros). I back
up the data hourly to the other machine: if something fails on machine
A, I can set up machine B to take over just by changing IPs. The whole
operation should take about 5 minutes. But it doesn't: backup time is
<2 minutes, but restore time is much longer, more than 20 minutes
(Fb 1). If something fails on server A, it will take me almost 30
minutes to set up server B. That is much better than installing a new
operating system, or diagnosing and fixing the problem, but it will
certainly mean 30 minutes of downtime, losing some data along the way.
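
The hourly backup on machine A and the restore on machine B could be
scripted roughly as below. This is only a sketch: the helper names,
paths and credentials are hypothetical, and it assumes gbak is on the
PATH and that machine B can read the backup file (for example through a
shared directory). It relies only on gbak's -b (backup) and -c
(create/restore) switches.

```python
import os
import subprocess


def gbak_backup_cmd(database, backup_file, user, password):
    """Build the gbak command line that backs up `database` to `backup_file`."""
    return ["gbak", "-b", "-user", user, "-password", password,
            database, backup_file]


def gbak_restore_cmd(backup_file, database, user, password):
    """Build the gbak command line that restores `backup_file` into a new database."""
    return ["gbak", "-c", "-user", user, "-password", password,
            backup_file, database]


def run_backup(database, backup_file, user, password):
    """Step for machine A: take the hourly backup."""
    subprocess.run(gbak_backup_cmd(database, backup_file, user, password),
                   check=True)


def run_restore(backup_file, standby_db, user, password):
    """Step for machine B: rebuild the standby database from the backup."""
    # gbak -c will not overwrite an existing database, so restore to a
    # temporary name and swap it in only after the restore has succeeded.
    fresh = standby_db + ".new"
    if os.path.exists(fresh):
        os.remove(fresh)
    subprocess.run(gbak_restore_cmd(backup_file, fresh, user, password),
                   check=True)
    os.replace(fresh, standby_db)
```

A scheduler on machine A would call something like
run_backup("/data/live.fdb", "/share/live.fbk", "SYSDBA", "secret").
Running run_restore on machine B right after each backup, instead of
waiting for a failure, is one possible way to keep the 20-minute restore
out of the outage itself; whether that suits this setup is an assumption.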

Not a problem for this company - the server has been up since 1 January
2002 without a single crash, so I haven't had the opportunity to check
whether this process works "for real" - but certainly a problem for
other companies that handle more critical data.
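
To put those 30 minutes next to the "5 nines" figure quoted at the top,
the downtime allowed by an availability percentage can be worked out
directly. The sketch below is plain arithmetic; the function names are
made up for illustration and a year is taken as 365 days.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def allowed_downtime_seconds(availability_percent, period_seconds):
    """Downtime budget, in seconds, for a given availability over a period."""
    return period_seconds * (100 - availability_percent) / 100


def achieved_availability_percent(downtime_seconds, period_seconds):
    """Availability, in percent, given the total downtime within a period."""
    return 100 * (1 - downtime_seconds / period_seconds)


if __name__ == "__main__":
    for label, period in (("week", SECONDS_PER_WEEK), ("year", SECONDS_PER_YEAR)):
        budget = allowed_downtime_seconds(99.999, period)
        print(f"99.999% allows {budget:.1f} s of downtime per {label}")
    # One 30-minute failover in a year, as in the scenario described above.
    print(f"{achieved_availability_percent(30 * 60, SECONDS_PER_YEAR):.4f}%")
```

With these definitions, 99.999% allows about 6 seconds of downtime per
week, or a little over 5 minutes per year, and a single 30-minute outage
in a year corresponds to roughly 99.994% availability.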

Nickolay's nbackup will improve this process dramatically, and I will be
able to run backup/restore cycles in less time.

(I haven't discussed replication because this email is already too long.)

Cluster solutions are a simpler guarantee against a hardware breakdown.
I'm sure that in the near future cluster support will be requested more
often by end users, and we should keep cluster support on our "todo list".

Artur