Subject: Re: [firebird-support] Re: FB suitability for Consolidated Database of 15 GB.
Author: Helen Borrie
At 02:45 PM 9/10/2011, raja_s_patil wrote:

>We have decided to try out the IBPhoenix IBReplicator for this. This is also a first for me. We thought that deciding on the database size was the first step, and checking which RDBMS would support it the next. Since FB 2.5 seems to be OK, we will evaluate IBReplicator. We can download the trial version and test the replication schema, and if needed we can get queries answered in its forum. Over the next 10-15 days we are going to try out replication and do performance testing of FB before development of the actual project commences.

That's a wise move. It is very flexible and I think you will find a solution to *your* requirements there. Without going into detail (wrong forum) two important aspects will be 1) that you have primary keys defined for all tables and 2) you keep a key on every replicated record that distinguishes the branch. Do join the Replicator forum and do study the help thoroughly.

>--- In firebird-support@yahoogroups.com, "samcarleton" <scarleton@...> wrote:
>
>> Have you come up with a strategy to replicate the branch databases to the HO? If so, have you validated that it will work?

You can trial IBReplicator on InterBase. It supports IB, Firebird and Oracle. Note that these three (but not MSSQLServer) use multiple record versioning.

>
>> In my situation, we want to take data from multiple IB clients and put it into one central Microsoft SQL database. Microsoft has developed a very powerful and flexible system called the Microsoft Sync Framework. Sync Framework can work with any RDBMS, assuming the RDBMS exposes enough info.
>>
>> Take a look at the Synchronization Example at the bottom of this link: http://msdn.microsoft.com/en-us/sync/bb821992 It explains the basic concept of how Sync Framework syncs the data. The whole key to it is the "Update Tick Count", which is database-wide. In the Microsoft world this is the @@DBTS function. They refer to this function as either the timestamp or as the rowversion.

This seems to imply that "any RDBMS" holds each specific record in exactly the same place on disk for its entire life. Amongst the non-desktop databases, MSSQLServer might be alone in that simplistic assumption. In an MRV database, one "global" version and any number of private versions might co-exist on disk at any moment. The next private version that commits successfully becomes the new "global" version that all new transactions subsequently see. At that moment, one or more of the co-existing private versions will (under almost all transaction configurations) still see a previous global version as the latest committed.

The most recently committed version won't be visible to those other transactions until their own transactions commit. It's not even as simple as *two* potentially different views of what is the latest committed. Transactions start and end constantly at random times. Meanwhile, the older and newer versions more than likely exist in different physical places on disk.

>> When syncing begins, the sync process needs to be able to get the "minimum active rowversion". This is the very last update tick count value used that has been committed.

A synch tool is a client. Like any client, it sees the latest committed version at the moment its transaction starts and continues to see that version until its transaction ends. (The exception here would be a transaction with Read Committed isolation, which would be suicidal for a replication/synchronisation scenario so we won't even think about it.)
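
For illustration only: under SNAPSHOT (concurrency) isolation, the default for Firebird, a client's view of the database is frozen at the moment its transaction starts. A sketch in isql syntax (the table name is hypothetical):

```sql
-- Start a snapshot transaction: from here until COMMIT/ROLLBACK, this
-- client sees only record versions committed before this point.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

-- Repeatable: concurrent commits by other transactions are invisible here.
SELECT * FROM CUSTOMER;

COMMIT;
```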

The physical address of a record is stored on the record, in an accessible field called RDB$DB_KEY. The problem for your requirement is that RDB$DB_KEY is transient. You have no way to tell *how* transient. For the client, the RDB$DB_KEY (db_key for short) is sticky for the duration of a transaction. The client has the option to request for the scope of db_key to be connection-wide (which is another way to commit database suicide, as it inhibits garbage collection).
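
You can see the db_key for yourself from any client. A sketch (table and column names are hypothetical):

```sql
-- RDB$DB_KEY can be selected like an ordinary column:
SELECT RDB$DB_KEY, CUSTOMER_ID, NAME FROM CUSTOMER;

-- Within the SAME transaction it can be used for a fast positioned lookup.
-- Never store it for later reuse: garbage collection, backup/restore and
-- ordinary update activity can all move the record to a new address.
SELECT NAME FROM CUSTOMER WHERE RDB$DB_KEY = ?;
```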

>>
>> In IB 7.5.1 I don't see any way to get at any value that could be used for this. There is a generator, but when you poll the generator, it always gives you the very latest value, even if that value is in an uncommitted transaction. If you use that value as the start of the next sync, all the rows in the uncommitted transaction will never be synced. If you pick a time that is too far in the past, next time the sync happens there will be duplicates.

There is some confusion here. A generator value is always unique and won't be duplicated, even if it is generated and stored in a record version that never gets committed. You wouldn't have duplicates unless you replicated the same record twice with the same generated "record version" on it.
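
A small sketch of why a rolled-back increment is never reused (the generator name is hypothetical). Generators live outside transaction control, which is exactly what makes their values unique:

```sql
CREATE GENERATOR GEN_SYNC_ID;

-- Transaction A:
SELECT GEN_ID(GEN_SYNC_ID, 1) FROM RDB$DATABASE;  -- returns, say, 1
ROLLBACK;

-- Transaction B: the rollback did NOT give the value back, so the
-- next call continues from where the generator really is:
SELECT GEN_ID(GEN_SYNC_ID, 1) FROM RDB$DATABASE;  -- returns 2, never 1 again
```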

>>
>> Internally FB has to have this basic concept since the system works just fine when there are 15 active insert transactions and another transaction does a select, that select returns the 'safe' values. I am assuming that some concept along the lines of 'minimum active rowversion' is being used to determine what are the 'safe' rows to return.

The concept used by replicators for Firebird and IB is to keep a log of all changes, based on primary key, with timestamping. When the replicator starts a replication, it reads the log and replicates to the target tables in the exact order that the changes occurred on the source. The nett result of a replication is that the target records have the same "latest version" as was the "latest version" on the source when the transaction started. You don't get duplicates, you don't rely on any uniqueness criteria other than those intrinsic to the data and there's no use for a "time-ticker".
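
The logging concept could be sketched roughly like this. All names here are hypothetical, and a real replicator also handles multi-column keys, conflict resolution and log cleanup; this only shows the shape of the idea:

```sql
-- A minimal change log: primary key of the changed row, the operation,
-- and a timestamp so changes can be replayed on the target in order.
CREATE TABLE REPL_LOG (
  LOG_ID     BIGINT NOT NULL PRIMARY KEY,
  TABLE_NAME VARCHAR(63) NOT NULL,
  ROW_PK     VARCHAR(255) NOT NULL,
  OPERATION  CHAR(1) NOT NULL,   -- 'I', 'U' or 'D'
  CHANGED_AT TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
);

CREATE GENERATOR GEN_REPL_LOG;

SET TERM ^ ;
CREATE TRIGGER CUSTOMER_LOG FOR CUSTOMER
AFTER INSERT OR UPDATE OR DELETE
AS
BEGIN
  -- In a multi-action trigger, NEW is null for deletes and OLD is null
  -- for inserts, so COALESCE picks whichever key value exists.
  INSERT INTO REPL_LOG (LOG_ID, TABLE_NAME, ROW_PK, OPERATION)
  VALUES (GEN_ID(GEN_REPL_LOG, 1), 'CUSTOMER',
          COALESCE(NEW.CUSTOMER_ID, OLD.CUSTOMER_ID),
          CASE WHEN INSERTING THEN 'I'
               WHEN UPDATING  THEN 'U'
               ELSE 'D' END);
END^
SET TERM ; ^
```

The replicator then reads REPL_LOG in LOG_ID/CHANGED_AT order and applies each change to the target by primary key, which is why point 1) above (primary keys on every table) matters so much.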

>>
>> So the question is: Does FB expose this type of information to make implementing a sync type of process easy or is this concept deep in the engine, never exposed, so one needs to hack the whole sync process?

You could look into RDB$DB_KEY to see whether it could give you anything you could use but it does not work like a ticker. It identifies a unique address on disk. It's not hacking; it has always been there and some tools actually show it to you on views and unkeyed tables. You need to get yourself a clear idea of just what it represents and NOT rely on it persisting beyond the bounds of your transaction. Myself, I don't think it's what you are looking for.

>> On the IB 7.5.1 project I am simply the developer trying to figure out Sync Framework, thus have little say in long-term strategy. For the other project, where I am looking at FB 2.5/3.0, I am in command and thus am starting to look at other options.

Good luck with the Sync Framework thing, which I don't think I actually "get", since the boundaries of a record's status in a multi-user environment are necessarily very dynamic... hence I can't grok how real-time synchronisation could be achieved while other work was in progress. Replication can't be anything but a few steps out, from source to target or target to source, because that's the nature of the beast that is the multi-user database.

And to R. S. Patil, we have hijacked your thread and for this, I apologise.

heLen