Subject Re: [firebird-support] Database synchronization
Author David Johnson
Personally, I have never been comfortable with delta based
synchronization on a schema that was not designed specifically to be
distributed. There are too many opportunities for data loss (read
Barings Bank, Enron and Worldcom).

In my opinion, the big issue with distributed RDBMS is coming up with a
data model, process model, and primary keying structure that guarantees
uniqueness across platforms and allows databases to be merged without
conflict or loss of audit trail. These issues are independent of
operating system, DBMS, or data migration tool.

Our company had some really bad experiences with distributed systems
largely because they simply took a centralized system and tried to use a
delta based tool to merge databases over an IP connection. Most data
modelers and programmers have real difficulty in crossing the boundary
from RDBMS to DRDBMS.

I am partial to an "insert only" model that explicitly prohibits deletes
entirely, and allows updates only in the special case of invalidating an
old record (which substitutes for deletion). This way, you only need
to know when your last synchronization was, and synchronize records that
have been inactivated or inserted since the last synchronization time.

Conceptually, I use this structure for every table:

key_column (typically GUID)
Sequence_number (integer)
create_user
create_class
create_method
create_timestamp
invalidation_user
invalidation_class
invalidation_method
invalidation_timestamp
{business data ...}

I primary key on key_column + sequence number, and have further indexes
on create_timestamp and invalidation_timestamp since these columns are
used in virtually every query. I also have no problem with the idea of
having lots of audit trail information.

There are still issues with this - if two people update the same entity
while disconnected, and then they try to sync up, there may be two new
versions of the entity with the same ID and sequence number (in database
terms, each application instance invalidates the last record that they
have in common and inserts a new row in its private database instance
that has the same primary key as the other one).

However, there are other cues that can be used to recommend one record
over the other when reconciling databases during merge.

I keep the timestamps in UTC rather than local time so data can be
merged across time zones.



On Wed, 2005-03-02 at 10:18, Aitor Gómez González wrote:
>
>
> Hi all,
>
> I'm in process of designing a new application and need help with one
> of the requirements.
>
> Data from many customers is stored in a central (Firebird) database
> server. Customer should be able to manipulate his data
> (select/update/delete) from a web server via internet, and from his
> laptop computer, allowing him to work without internet connection
> against a local database. I cannot figure out the right way to get
> synchronized data between laptop and main server.
>
> Looking for information about this I found "FiBRE"
> (http://fibre.sourceforge.net/) and Sync4j
> (http://sync4j.funambol.com). Does anybody have experience with any of
> both, or can give any clue about the topic for further reading? Main
> server would probably be Linux, laptop clients will be XP.
>
>
> Thanks in advance,
> Aitor.
>
>
>
> Yahoo! Groups Links
>
>
>
>
>
>
>