Subject Re: [IB-Architect] SOME SOLUTIONS TO OLD PROBLEMS
Author dcalford
Hi All,

After reviewing my first mind dump, I think I should clean up some of my work.

So, for those who have the patience and stamina to put up with another
diatribe from me,
here are my additions/replacements to what I have been describing.

This is a sort of replacement for

<BIG SNIP> Different Methods of Backing up your data<SNIP>
> --------------------------------------------------
>

but it is not really a finished replacement yet. As I remember and structure
things (or get asked for clarification by readers), my style of writing changes.

I am trying to explain all the concepts (and pitfalls), but it is a large
subject and I know I will have forgotten to explain some details properly. I
am also trying to set up my explanations so that a beginner to IB would be able
to understand the system (so you may read about concepts you already take for
granted).

regards

Dalton
-----------------------------------------------------

To fully explain a few things, I need to show that my concept of a database
spans many servers and GDB files.

First off, I have a type of file that I refer to as a core file.
A core file has all of the users' data.
There can be many core files, and they are replicated amongst themselves.
They can sit on the same network or on different networks, as long as some type
of update method is applied (even if that means applying roll-forward logs
that have arrived in the mail).
Of all the core files, one is designated as the primary file, and this file has
the task of conflict resolution between the different core files.
The only difference between the primary file and the rest is a setting in the
variables table.
In this way, if the primary file fails, another core file can be upgraded to
become the primary file when necessary.
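
To give you an idea of what I mean by the variables table, here is a rough
sketch (the table and column names are only placeholders for illustration,
not my actual schema):

/* Hypothetical variables table - one row per database-wide setting. */
CREATE TABLE VARIABLES (
    VAR_NAME   VARCHAR(32) NOT NULL PRIMARY KEY,
    VAR_VALUE  VARCHAR(64)
);

/* On the primary core file only: */
INSERT INTO VARIABLES (VAR_NAME, VAR_VALUE) VALUES ('IS_PRIMARY', 'YES');

/* Any code that needs to know whether it is talking to the primary
   core file just checks the setting: */
SELECT VAR_VALUE FROM VARIABLES WHERE VAR_NAME = 'IS_PRIMARY';

/* Promoting another core file is simply an update on the old primary
   (set the value to 'NO') and on the newly promoted one (set it to 'YES'). */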

I have another type of file called the historical files - this may or may not
apply to a developer's needs. If the data lends itself to breakpoints or
static tables that do not relate to the newer tables, then separating this
data into a subset of a core file may be the key. The historical files can
sit on any machine on the network, and should be on a machine that directly
relates to the amount of use they get (if they are only accessed by a single
person once a month or so, don't bother buying a Sun server to host them...)

I have a third type of file, the roll-forward logs. I set up the logs so that
they represent a week of time and are sized so they can easily be placed onto
a CD.

In addition to the different servers and GDB types, I have BOTS (or IBOTS, for
InterBase bots) that assist in the maintenance of the system. I will come
back to what the bots are and what they do for you later in this series of
documents.
(Currently, I am working with a friend to produce a standard bot template that
is easily modified, so that everyone out there can quickly use and/or customize
their own bots and get up and working quickly.)

Roll Forward Concepts and Implementation

What is Roll Forward? How does it affect my database?
InterBase is designed to hold large amounts of data that is shared at the
same time by a large group of people. As such, it is rare to have the
database shut down to allow the system administrator to make a proper external
backup of the data. If the database is not shut down when a backup is made,
at best the backup is corrupt; at worst, you have damaged your database. This
is caused by the backup process (either a third-party tool or the copy
command) not understanding that only portions of the file are locked and that
they may be in a transitional state during the copy process (the file is
changing as the copy is taking place).

InterBase provides a backup utility called GBAK. Instead of physically
copying the file, it connects to the database as a user and queries all the
information about the database. It then puts this information into another
file (or directly onto tape) in such a way that it can be physically copied or
moved. Since all this is under transaction control, you get a perfect
snapshot of the database as of when the transaction began. The problem is,
what do you do when your backup routine takes a long time? Or when your backup
regimen is once a day or even once a week? A file corruption could have you
lose hours or even days of work.

Once you restore your database, you then need to re-enter all the changes to
your database. Just finding out what has happened is time-consuming. It
would be great if every change to the database were in an SQL script that you
could run to bring the database up to date. This is what "Roll-Forward" is
about: it allows you to easily rebuild a series of actions by applying a
single, user-editable process.

InterBase currently has no inherent method of doing this. (In pre-SuperServer
days, IB had this functionality, but it got broken and nobody bothered to fix
it due to the difficulties involved.) This does not mean it cannot be done.
By using the triggers that are on a table, you can create a series of SQL
statements that duplicate what has just occurred.

For example, you can have an after-insert trigger on a table that inserts
values into a series of special tables that hold the changes, and do the same
with the after-update and after-delete triggers.
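
To make this concrete, here is a minimal sketch of what such a trigger might
look like for a made-up CUSTOMER table with character columns CUST_ID and
CUST_NAME (the CHANGE_LOG layout is also just an illustration, not my actual
schema):

/* Hypothetical table that accumulates the SQL needed to replay changes. */
CREATE TABLE CHANGE_LOG (
    LOG_ID     INTEGER NOT NULL PRIMARY KEY,
    LOGGED_AT  DATE,               /* TIMESTAMP in a dialect 3 database */
    USER_NAME  VARCHAR(31),
    SQL_TEXT   VARCHAR(1024)
);

CREATE GENERATOR GEN_CHANGE_LOG;

SET TERM !! ;

/* After every insert on CUSTOMER, store an equivalent INSERT statement
   that can later be replayed against a restored copy of the database. */
CREATE TRIGGER CUSTOMER_AI FOR CUSTOMER
ACTIVE AFTER INSERT POSITION 0
AS
BEGIN
  INSERT INTO CHANGE_LOG (LOG_ID, LOGGED_AT, USER_NAME, SQL_TEXT)
  VALUES (GEN_ID(GEN_CHANGE_LOG, 1), 'NOW', USER,
          'INSERT INTO CUSTOMER (CUST_ID, CUST_NAME) VALUES (''' ||
          NEW.CUST_ID || ''', ''' || NEW.CUST_NAME || ''');');
END !!

SET TERM ; !!

The after-update and after-delete triggers follow the same pattern, writing
out UPDATE and DELETE statements instead.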

A way to safely implement a roll-forward system with InterBase is to create
an IBOT that works on a timer. At every preset interval, the IBOT
queries the table to see all the newly committed changes and writes them out to
another database specifically designed to hold the logs (preferably on another
machine). Then the IBOT removes the entries from the original database.
This is handled nicely by using Jason's multi-database transaction components
(or by going directly to the API). This way, the deletes only occur if the data
is properly written into the target database.
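
In rough terms, and assuming the CHANGE_LOG table sketched above, the
statements the IBOT would run look something like the following; the
:parameters are filled in by the bot, and the two connections are wrapped in
a single two-database transaction:

/* On the production database connection: read everything that has been
   logged up to a cut-off time. */
SELECT LOG_ID, LOGGED_AT, USER_NAME, SQL_TEXT
FROM CHANGE_LOG
WHERE LOGGED_AT <= :CUTOFF
ORDER BY LOG_ID;

/* On the log database connection (same two-phase transaction): copy each
   row read above into the roll-forward database. */
INSERT INTO CHANGE_LOG (LOG_ID, LOGGED_AT, USER_NAME, SQL_TEXT)
VALUES (:LOG_ID, :LOGGED_AT, :USER_NAME, :SQL_TEXT);

/* Back on the production connection, in the same transaction: remove the
   rows that were copied.  If the two-database commit fails, the deletes
   roll back and nothing is lost. */
DELETE FROM CHANGE_LOG
WHERE LOGGED_AT <= :CUTOFF;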

With proper timestamp values, you can take a backup, query the roll-forward
database(s) and apply the changes that have occurred from the point at which
the backup transaction began.
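
Again assuming the CHANGE_LOG layout above, bringing a restored backup up to
date is a matter of pulling the logged statements back out in order, starting
from the moment the backup transaction began, and executing them one by one:

/* Run against the roll-forward (log) database.  :BACKUP_STARTED is the
   time the GBAK backup transaction began. */
SELECT SQL_TEXT
FROM CHANGE_LOG
WHERE LOGGED_AT >= :BACKUP_STARTED
ORDER BY LOG_ID;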

You can also find in the log where a user has made a mistake, roll the log
forward to that point, remove the user's actions, and continue the log as if
the user never performed the data-damaging action in the first place... (this
is only one way of reversing a user's actions - more on this later)

Replication and Load Balancing

What is replication? How does it compare to shadowing?
To replicate data is to copy changes from one operational database to
another operational database. There are different kinds of replication.
One-directional replication occurs when the data is moved from an active
server that is serving client information to a non-active server which only
receives the changes. When the primary server becomes corrupt and unusable,
the secondary becomes active. This is fine for keeping a ready-to-run
operation going. Unfortunately, it does not help in situations where a single
server cannot process all the requests that are coming to it. Everyone slows
down and the system becomes unworkable. To combat this, a bi- or poly-server
environment can be implemented. If you have users log in to different servers,
all with a copy of the data, then no single server acts as a bottleneck.
You can even use a similar method to keep information at multiple locations
and keep it all in sync.

Sounds wonderful, doesn't it?
I wish.

Many people use simple numbers as their primary (surrogate) keys; these
numbers are provided by generators. A generator has no method of knowing
what numbers have been given to what record on another server.
This can cause invalid numbers or index corruptions. Not a pretty sight.
Due to the need to load balance the work of a data warehouse, many servers
may be serving multiple users and producing calculated results that
encompass the work being performed by other servers. One of the best examples
of such a system is the fact that it is necessary to create a unique
identifier for every record in a table. Since this identifier is generated by
the individual server, it is necessary to make sure it is calculated in such
a way that it identifies the very server it was created on. In this case, a
server-wide variable called server name - let's assume it is a string that
holds 2 characters - is first used as the beginning of the unique identifier
(UID); then, because a server may have two or more copies of the database on
it, a second variable, database name, is also added. So far, the UID is 4
characters in length and it still needs further processing. Because tables may
be combined via union clauses or recursive tree structures, the unique
identifier must also be unique across all possible tables, so each table has a
unique shortcut name that is 4 characters in length. Now we are at the point
that most other designers begin with: an auto-incrementing number,
generally created by a generator and beginning with a value of 1. The only
difference is that we cast the integer value into a string value that is
large enough to hold the largest value the integer can be - in this case, 11
digits including the sign. To make sorting and other operations on the UID
easier, we will '0'-fill the value after the cast so that it looks like this
001
002
003

instead of

1
2
3

This fixes the sort order so that we do not have sorts like this

19
2
20
21
22
....
29
3
30
31
etc.

With this structure, any number of users can insert any number of records in
multiple databases, across multiple servers, at the same time and not have a
duplicate UID assigned. This even works while the communication between
servers is down and/or updates travel over a time-delayed update system. What
we have come up with is a UID that is 19 characters in length and looks like
this:

CHARS 1-2   is a unique Server identifier
              - ie, AA or AB etc.
CHARS 3-4   is a unique GDB identifier
              - ie, AA or AB etc.
CHARS 5-8   is a unique TABLE identifier
              - the same across all databases
CHARS 9-19  is the actual number of the record, ranging from
              -2147483648 to 2147483647 and zero-filled,

for example

AAAAAAAA-2147483648
AAAAAAAA00000000001
AAAAAAAA02147483647
etc.

I will be posting the SP code that does this.
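
Until then, here is a rough sketch of the idea so you can see the shape of it.
The generator, variable and procedure names are placeholders; it assumes the
variables table holds two-character SERVER_NAME and DATABASE_NAME entries,
and it does not bother with negative generator values:

CREATE GENERATOR GEN_UID;

SET TERM !! ;

/* Sketch only: builds a 19-character UID of the form
   <2-char server><2-char gdb><4-char table><11-char, zero-filled number>. */
CREATE PROCEDURE MAKE_UID (TABLE_CODE CHAR(4))
RETURNS (NEW_UID CHAR(19))
AS
  DECLARE VARIABLE SRV_ID CHAR(2);
  DECLARE VARIABLE DB_ID  CHAR(2);
  DECLARE VARIABLE NUM    INTEGER;
  DECLARE VARIABLE WORK   INTEGER;
  DECLARE VARIABLE DIGITS INTEGER;
  DECLARE VARIABLE PAD    VARCHAR(11);
BEGIN
  /* The server and database identifiers live in the variables table. */
  SELECT VAR_VALUE FROM VARIABLES WHERE VAR_NAME = 'SERVER_NAME'
    INTO :SRV_ID;
  SELECT VAR_VALUE FROM VARIABLES WHERE VAR_NAME = 'DATABASE_NAME'
    INTO :DB_ID;

  /* One shared generator keeps the sketch short; one per table works too. */
  NUM = GEN_ID(GEN_UID, 1);

  /* Count the digits in NUM, then build the leading zero fill. */
  WORK = NUM;
  DIGITS = 1;
  WHILE (WORK >= 10) DO
  BEGIN
    WORK = WORK / 10;
    DIGITS = DIGITS + 1;
  END

  PAD = '';
  WHILE (DIGITS < 11) DO
  BEGIN
    PAD = PAD || '0';
    DIGITS = DIGITS + 1;
  END

  NEW_UID = SRV_ID || DB_ID || TABLE_CODE || PAD || CAST(NUM AS VARCHAR(11));
END !!

SET TERM ; !!

A before-insert trigger on each table can then call it with EXECUTE PROCEDURE
... RETURNING_VALUES to fill in the key, so every server hands out identifiers
that can never collide with those from another server.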

This sort of planning is also required due to the fact that the data must
be replicated over many servers. If you use a simple replication strategy, a
single column can be used to mark whether a record needs to be copied from
one server to the next (ie, select * from mytable where changed='YES'), but
what happens when you are updating multiple servers or using a ring or star
topology? It is extremely important to designate what information is to be
replicated, how it is to be marked, and to which servers it must be
transferred. Overall, almost all further aspects of data design must be
questioned as to whether they would hold up in a multi-server system.
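
One way around the single-flag problem, sketched here with made-up names, is
to queue each change once per target server instead of putting a single flag
on the record itself:

/* Hypothetical queue: one row per change, per server that still needs it. */
CREATE TABLE REPL_QUEUE (
    RECORD_UID    CHAR(19) NOT NULL,
    TABLE_CODE    CHAR(4)  NOT NULL,
    TARGET_SERVER CHAR(2)  NOT NULL,
    QUEUED_AT     DATE,
    PRIMARY KEY (RECORD_UID, TARGET_SERVER)
);

/* The replication IBOT for server 'AB' pulls only its own work... */
SELECT RECORD_UID, TABLE_CODE
FROM REPL_QUEUE
WHERE TARGET_SERVER = 'AB'
ORDER BY QUEUED_AT;

/* ...and clears each entry once the change is confirmed on that server. */
DELETE FROM REPL_QUEUE
WHERE TARGET_SERVER = 'AB' AND RECORD_UID = :RECORD_UID;

The same after-insert/update/delete triggers that feed the roll-forward log
can insert one queue row per target server.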
Another problem is duplicate entries of the same data by different users into
different servers at the same time.
For example,
a customer calls in and gives his change of address; employee A gets
their call, takes down the information, and prepares to process it after the
current work is finished. Meanwhile, the customer calls in again with a
clarification; employee B gets the new data, sees it has not been entered, and
begins entering it. Employee A and employee B are working on different
servers. When employee A starts to enter the changes, the information from
employee B has not been posted, so the information has not been replicated to
employee A's server. Both employees are working on the same record, at the
same time, without either server producing a block, because the two servers
are not aware of each other.
This can be extended to two new records being entered at the same time, or a
record being deleted on one server while another server is updating it.
In order to ensure that the data is correct and not duplicated, each table
has a series of 'KEY' fields that allow for duplicate checking. Also, one
server out of the cluster is responsible for maintaining the data integrity.
All the servers must have the code in place to do the maintenance, but only
one may be active at a time. This is handled by a value in the variables table
that the database checks.
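
As an illustration of the kind of duplicate checking I mean, the server
responsible for integrity might periodically run something like this against
the made-up CUSTOMER table, assuming its 'KEY' fields are last name, first
name and phone number:

/* Find groups of records that agree on the designated KEY fields but were
   entered independently (and so carry different UIDs). */
SELECT LAST_NAME, FIRST_NAME, PHONE, COUNT(*)
FROM CUSTOMER
GROUP BY LAST_NAME, FIRST_NAME, PHONE
HAVING COUNT(*) > 1;

What the integrity code then does with the duplicates is up to the
application.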
So you can see, in order to allow load balancing, you must design your data
in a very convoluted and detailed way. Don't worry, it is actually very
simple if you do not lose the overall picture and get lost in the details.
In comparison, a shadow is a one-way replication system that works on a
single server under the control of InterBase. It is good if your data is on
one drive and the shadow is on another and one of the drives fails.

Non-Transactional Process Control Concepts and Implementation

When a user logs onto InterBase and makes a change, that change does not
affect others until it is committed to the database. But after it is
committed, it is difficult to isolate and reverse the user's actions at a
later date. In order to fix this, every table needs a linked child table that
holds all the main table's changes. In this way, you can run a script and
reverse all of a single user's actions across the database.
In order to prevent your database from becoming huge, you need to periodically
purge the data history.
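
A minimal sketch of such a child table, again using the made-up CUSTOMER table
and placeholder column names, might look like this:

/* Child table recording every change made to CUSTOMER, and by whom. */
CREATE TABLE CUSTOMER_HIST (
    HIST_ID    INTEGER NOT NULL PRIMARY KEY,
    CUST_ID    CHAR(19) NOT NULL,   /* links back to the main table */
    CHANGED_AT DATE,
    CHANGED_BY VARCHAR(31),
    OLD_NAME   VARCHAR(60),         /* one OLD_/NEW_ pair per column tracked */
    NEW_NAME   VARCHAR(60)
);

/* Reversing one user's work: select their history rows in reverse order
   and apply the OLD_ values back onto CUSTOMER. */
SELECT CUST_ID, OLD_NAME
FROM CUSTOMER_HIST
WHERE CHANGED_BY = :USER_NAME
ORDER BY HIST_ID DESC;

/* A periodic purge keeps the history from swallowing the database. */
DELETE FROM CUSTOMER_HIST
WHERE CHANGED_AT < :PURGE_BEFORE;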


------------------------------------------------------------------------
more to come................

best regards

Dalton