Subject: Re: [IB-Architect] SOME SOLUTIONS TO OLD PROBLEMS
Author: Dalton Calford
Post date: 2000-06-22T23:43:49Z
Hi All,
I'm back, and today's lecture is REPLICATION.
:)
------------------------------------------
People who begin to work with replication think it is a simple process
of applying the changes that have occurred in one database onto a
separate database.
I wish it were.
What is actually happening is a process as complex as file locking in a
multi-user database. The problem is, you cannot do any form of file
locking, or even MGA, because at the time a change is taking place
on one server, there is no way for the other servers to know that it is
happening.
For example,
a customer calls in and gives a change of address. Employee A takes the
call, writes down the information, and prepares to process it after
the current work is finished. Meanwhile, the customer calls in again
with a clarification; Employee B takes the new data, sees that it has not
been entered, and begins entering it. Employee A and Employee B are
working on different servers. When Employee A starts to enter the
changes, the information from Employee B has not yet been posted, so it
has not been replicated to Employee A's server. Both employees are
working on the same record, at the same time, without either server
producing a block, because the two servers are not aware of each other.
This can be extended to two new records being entered at the same time,
or a record being deleted on one server while another server is updating
it.
So, with this in mind, let's review the different approaches to
replication.
Failover Replication
With this style of approach, users and bots only work from one core file
at a time.
Secondary core files are updated by dedicated bots, but the flow is one
way only.
This is the easiest form of replication to set up, and it is best to
think of it as a remote shadow.
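To make the one-way bot concrete, here is a minimal sketch in Python against
generic DB-API connections. The REPL_LOG change table, its columns, and the
polling interval are invented for the illustration, not part of any existing
schema, and the '?' parameter style depends on your driver.

    import time

    def replicate_once(primary, secondary):
        # Copy unapplied change-log rows from the primary core file
        # to a secondary, one way only.
        src = primary.cursor()
        src.execute("SELECT LOG_ID, SQL_TEXT FROM REPL_LOG "
                    "WHERE APPLIED = 0 ORDER BY LOG_ID")
        dest = secondary.cursor()
        mark = primary.cursor()
        for log_id, sql_text in src.fetchall():
            dest.execute(sql_text)                     # replay the recorded change
            mark.execute("UPDATE REPL_LOG SET APPLIED = 1 WHERE LOG_ID = ?",
                         (log_id,))
        secondary.commit()
        primary.commit()

    def run_bot(primary, secondary, interval=30):
        # The dedicated bot simply polls and pushes; users never write
        # to the secondary, which is why it acts like a remote shadow.
        while True:
            replicate_once(primary, secondary)
            time.sleep(interval)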
Shared Replication
Depending upon the application, you may find that while all users may
query all the tables in a database, only one or two users actually
modify data in certain tables.
An example of this would be a table that holds large batches of
information (like that generated by outside services or machines; with
us, it is client transactional data from Bell). If you can segregate
your users into those who modify this set of tables versus those who
modify that set of tables, you can split your users onto two or more
servers. You can even have your applications understand the split and
connect to the appropriate server when an operation starts.
Although all servers contain the same data, you have balanced the load
(especially if you have a lot of triggers and server-side verification
going on).
This method allows you to use a slightly modified version of the
Failover method, with the idea that it is one-way replication split
across tables.
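On the application side, that split can be as small as a lookup of which
server owns which set of writable tables; the table and server names below
are made up for the illustration.

    # Hypothetical split: batch tables are modified only on server B,
    # everything else on server A.  Queries can go to either.
    BATCH_TABLES = {"CLIENT_TRANSACTIONS", "BELL_BATCH_IMPORT"}

    def write_server_for(table_name):
        # Pick the server a modifying operation should connect to
        # before the operation starts.
        if table_name.upper() in BATCH_TABLES:
            return "server_b"
        return "server_a"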
Full Replication
With Failover or Shared Replication, you can normalize your activities.
For example, if a record is created in a log, then updated multiple
times, then deleted, all within the span of the log, the bot would not
apply anything related to that record. You can use this method to cut
down network traffic so that only the final state of the data is applied
to the destination database.
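Here is a rough sketch of that normalization step, assuming each log entry
carries an operation code, the record's UID, and the changed values; the
field names are mine, not from any real log layout.

    def normalize(log_entries):
        # Collapse an ordered change log so only each record's final
        # state is applied.  A record that is created and then deleted
        # inside the same log window drops out entirely.
        final = {}                                    # uid -> (op, values)
        for entry in log_entries:                     # dicts: op, uid, values
            uid, op = entry["uid"], entry["op"]
            if op == "I":
                final[uid] = ("I", dict(entry["values"]))
            elif op == "U":
                prev_op, prev_values = final.get(uid, ("U", {}))
                merged = dict(prev_values)
                merged.update(entry["values"])
                final[uid] = (prev_op, merged)        # an insert stays an insert
            else:                                     # "D"
                if final.get(uid, ("", None))[0] == "I":
                    del final[uid]                    # never existed remotely
                else:
                    final[uid] = ("D", None)
        return final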
With full replication, this is a problem.
You may have a record that was in one core database that conflicts with
a record in the destination database (such as a unique key conflict
during a certain time period).
As long as your updates are frequent, this sort of problem is minimized,
but it does not go away.
The other problem is that the log from the remote database assumes the
data is in one state, but it has already been modified locally.
This all comes down to the fact that you not only need to know what has
happened, you also need to know what it happened to (if you update a
record, what were the values before the update).
You also need some form of conflict management and reporting.
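As a small illustration of why the before-values matter, here is a hedged
sketch of applying one replicated update: the log entry carries both the old
and the new values, and if the local row no longer matches the old values it
goes to a conflict report instead of being applied. The CUSTOMER table and
its columns are invented for the example.

    def apply_address_update(conn, entry, conflicts):
        # entry: {"uid": ..., "old": {"ADDRESS": ...}, "new": {"ADDRESS": ...}}
        cur = conn.cursor()
        cur.execute("SELECT ADDRESS FROM CUSTOMER WHERE UID = ?",
                    (entry["uid"],))
        row = cur.fetchone()
        if row is None or row[0] != entry["old"]["ADDRESS"]:
            conflicts.append(entry)      # changed locally - report, don't apply
            return False
        cur.execute("UPDATE CUSTOMER SET ADDRESS = ? WHERE UID = ?",
                    (entry["new"]["ADDRESS"], entry["uid"]))
        conn.commit()
        return True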
If you remember my description of core files, you will note that I
declared one core file as the primary file. This is because all
replication occurs to it first. It (via stored procedures and dedicated
bots) performs all the work on the data, and if a secondary core file
gets out of whack, it will produce the instructions to get it back into
shape.
This is very application-specific.
When I have finished my 10,000-foot overview, I will produce an example
application.
(ARE THERE ANY OLD ROLE-PLAYING GAME FIENDS OUT THERE? I AM PLANNING
AN AUTOMATED SYSTEM FROM A GAME CALLED TRAVELLER - IF YOU ARE
INTERESTED IN HELPING, LET ME KNOW AND WE WILL BUILD THE EXAMPLES
TOGETHER)
A lot of this comes down to the developer understanding their application
and how the tools work.
So, basically, if you are only changing the data in a single table on a
single server, your design is much simpler and easier to implement.
If you want full failover/load balancing without regard to the tables or
servers used, then you need to design your record conflict management
routines.
The other issue with replication is the size of your data and the time
it takes to insert it.
You may find that you need to compress your data before transmission and
then uncompress it when it reaches the destination machine. This is not
that difficult (libraries for almost every compression algorithm under
the sun are free to download and use on the web); what is slow is the
actual inserts. This is where the Classic architecture shines. Data
inserts under Classic, if performed on the server itself, are very fast.
Your app, as long as it goes directly to the GDB, does not go through the
networking layer or any of the other layers involved. With Classic, you
can compile the equivalent of gdb_server into your app. If you watch
gbak on a Classic box, you will note that gbak does all the work and no
gdb_server process is spawned. With SuperServer, you do not get this
benefit (at least as far as I have seen).
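The compression side really is the easy part; with Python's standard library
it comes down to a couple of lines (zlib here is just a stand-in for whichever
algorithm you settle on).

    import zlib

    def pack(log_text):
        # Compress a replication log before it goes over the wire.
        return zlib.compress(log_text.encode("utf-8"), 9)

    def unpack(blob):
        # Uncompress it on the destination machine.
        return zlib.decompress(blob).decode("utf-8")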
Replication is one of the areas of database design where one person's
solution is perfect for them and useless for somebody else. I will do
my best to show all the different examples in my project (I expect to
miss one or two), and I am hoping that I get enough questions to keep me
on my toes so that I don't miss anything.
In order for us to properly use replication, we resorted to a UID that
would survive backup/restore and be unique across the entire database
schema. This allowed us to identify records even when the data that
comprised their primary key had changed.
It also allowed us to identify which server created a record, and an
order of operations when it came to identifying conflicts. We also use
the UID because we use many-to-many relationships, and we needed a UID
that would work well in recursive structures.
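A rough sketch of how such a UID can be put together: an InterBase generator
survives backup/restore, and packing a per-server id into the high bits makes
the value unique across servers and also tells you where the record was
created. The generator name, the server id, and the bit split are assumptions
for the example, not our actual layout.

    SERVER_ID = 3                 # assigned once to each core server

    def next_uid(conn):
        # Composite UID: originating server in the high bits, a local
        # generator value (which survives backup/restore) in the low bits.
        cur = conn.cursor()
        cur.execute("SELECT GEN_ID(GEN_RECORD_UID, 1) FROM RDB$DATABASE")
        local_seq = cur.fetchone()[0]
        return (SERVER_ID << 48) | local_seq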
Best regards
Dalton
-----------------------------------
Ok, I think that I will give Helen a chance to look over the last week's
postings and come up with a format that is useful.
With that in hand I will try to fill in all the blanks.
In the meantime, I will be drawing up some charts and diagrams and
working with a friend on some template code.
PLEASE, ask questions. Keep them on this thread instead of sending them
directly to me.
That way, all the questions can be put into a FAQ and I can make sure I
have not left a lot out.