Subject Re: [firebird-support] Replicator pseudo-code (sorry, it's a little long...)
Author Jonathan Neve
Alan McDonald wrote:

>>I think using one central log table is much better. This means,
>>obviously, that you won't have all the fields of each table in the log
>>table, but only the primary key(s). I guess this could be seen as a
>>disadvantage, but in practice, I don't see what problems it could bring.
>I agree - central log table with PKs works for me
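The central-log approach both of us are describing can be sketched in Firebird SQL roughly as follows (a sketch only, assuming Firebird 1.5+ multi-action triggers; the table, generator, and column names are invented for illustration, not taken from either of our actual schemas):

```sql
/* Central replication log: one row per changed record,
   storing only the table name and the primary key value --
   not the operation type and not the field values. */
CREATE TABLE REPL_LOG (
  LOG_ID     INTEGER NOT NULL PRIMARY KEY,
  TABLE_NAME VARCHAR(31) NOT NULL,
  PK_VALUE   VARCHAR(40) NOT NULL,
  LOGGED_AT  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE GENERATOR GEN_REPL_LOG;

/* Example logging trigger for one replicated table
   (CUSTOMER with PK column CUST_ID). The CURRENT_USER test
   keeps changes applied by the replicator itself (user REPL)
   from being logged again and bouncing back. */
SET TERM ^ ;
CREATE TRIGGER CUSTOMER_REPL FOR CUSTOMER
AFTER INSERT OR UPDATE OR DELETE
AS
BEGIN
  IF (CURRENT_USER <> 'REPL') THEN
  BEGIN
    IF (DELETING) THEN
      INSERT INTO REPL_LOG (LOG_ID, TABLE_NAME, PK_VALUE)
      VALUES (GEN_ID(GEN_REPL_LOG, 1), 'CUSTOMER', OLD.CUST_ID);
    ELSE
      INSERT INTO REPL_LOG (LOG_ID, TABLE_NAME, PK_VALUE)
      VALUES (GEN_ID(GEN_REPL_LOG, 1), 'CUSTOMER', NEW.CUST_ID);
  END
END^
SET TERM ; ^
```

The replicator then reads REPL_LOG, looks up the current state of each referenced record at apply time, and decides for itself whether that means an insert, update, or delete on the other side.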
>>Another thing is that I found in my experience that having a log
>>containing each change that was made to the database, indicating whether
>>it was an insert, a delete or an update, and then trying to blindly
>>perform the same operation on the remote database, is not a flexible
>>structure at all. I had started out like this, but I found that it
>>brought many problems. I know it's possible, but it's not flexible.
>>Consider the following examples :
>> - Suppose someone creates a record, realises that it was a mistake,
>>and deletes it right away. In the log you would have two lines. The
>>first would say that the record was inserted, and the second would say
>>that it was deleted. So the replicator tries to create an insert
>>statement for a record that doesn't exist any more. But it can't because
>>in order to make the statement, it needs the values from all the fields.
>>So this fails.
>It doesn't fail, since the select of the now non-existent record returns no
>values, so there is nothing to insert - this is not a failure. The delete
>will also not fail since delete from table where PK=4 when no PK 4 exists is
>a silent action as well - no error.
Well, it was in my case, because I constructed my query based on the
results of the select. But it's true that it's not too hard to find a
workaround. This was actually the last straw that made me change
methods; the other points are more important.

>> - Suppose a primary key value gets changed. The same problem as above
>>occurs. You have a line saying to insert a certain record, but by the
>>time you try to create that SQL statement, that record no longer has the
>>same primary key value, and so you can't find it. At first I thought I
>>could get around this problem by using another field called PKOldValue,
>>in which I would put the old primary key value every time it gets
>>updated. But this doesn't work either, because the initial insert still
>>couldn't take place.
>Primary keys should never be changed - this can be a long discussion but I
>never change my PKs - it's a golden rule.
Of course. But in my case, I had a large database, which didn't conform
to this rule. So I had to make my replicator agile enough to cope with
it! :-)

>(auditable series is taken via a
>different route).
What do you mean by this?

>> - Suppose you have a trigger on one of your tables (Table A) that
>>inserts a record into another table (Table B). So you insert a record
>>into table A, and a record gets automatically inserted into table B. In
>>the log, you therefore get two lines, one for each table. But when you
>>replicate table A, the insert gets done automatically into table B on
>>the other server. When you then subsequently replicate table B, the
>>insert fails because it has already taken place.
>No - because every replication trigger recognises the user (e.g. REPL) and
>only inserts a log when the user is not REPL. All replicaton is done by the
>REPL user.
Yes, the replication triggers do, but that's not what I mean. What you
say is true : the changes applied by the replicator will not bounce back
to the original database, because of the replication user. But that
won't stop other triggers (which have nothing to do with replication)
from inserting records into other tables.

Suppose table A has a trigger (after insert or after update), which
inserts records automatically into table B. This happens all the time,
while people are using the database, quite independently from the
replicator. Obviously, each record that gets inserted into table A
generates a line in the log table, and the same goes for table B. Then
the replicator kicks in, and replicates table A first. In the remote
database, the same thing is going to happen : when I insert a record
into table A, a certain number of records are automatically going to get
inserted into table B. This has nothing to do with replication; that's
just the way I designed my database (admittedly not very well!). So then
afterwards, table B will get replicated, and all the same rows are going
to get inserted again. So it will either generate a PK violation, or, if
the PK values aren't the same, it will simply duplicate all the lines.
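The kind of business trigger I mean looks something like this (a sketch in Firebird PSQL; TABLE_A, TABLE_B, and the column names are invented for illustration):

```sql
/* A business trigger with nothing to do with replication: every
   insert into TABLE_A also creates a row in TABLE_B. On the remote
   server this fires again when the replicator inserts into TABLE_A,
   so TABLE_B's own logged rows would then arrive a second time. */
SET TERM ^ ;
CREATE TRIGGER TABLE_A_AI FOR TABLE_A
AFTER INSERT
AS
BEGIN
  INSERT INTO TABLE_B (B_ID, A_ID, CREATED_AT)
  VALUES (GEN_ID(GEN_B, 1), NEW.A_ID, CURRENT_TIMESTAMP);
END^
SET TERM ; ^
```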
So I can't think of any other (or simpler) solution to this problem
than to do as I did, that is, to not record the type of change that
occurred, but merely indicate which record was affected. That way, in the
case I described above, we can simply notice that the records are
already present, and so we update rather than inserting.
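The apply step this implies can be sketched like so (Python with sqlite3 standing in for the two Firebird databases; apply_change and all table/column names are my own illustration, not code from the actual replicator):

```python
import sqlite3

def apply_change(remote, table, pk_col, pk_value, row):
    """Replicate one logged change to the remote database.

    `row` is a dict with the record's current state on the source
    side, or None if the record no longer exists there. The log
    itself records only which record was touched, never whether
    the touch was an insert, update, or delete.
    """
    cur = remote.execute(
        f"SELECT 1 FROM {table} WHERE {pk_col} = ?", (pk_value,))
    exists = cur.fetchone() is not None

    if row is None:
        # Record gone on the source: delete remotely (no-op if absent).
        remote.execute(
            f"DELETE FROM {table} WHERE {pk_col} = ?", (pk_value,))
    elif exists:
        # Already present remotely (e.g. a trigger cascade created
        # it): update instead of inserting a duplicate.
        assigns = ", ".join(f"{c} = ?" for c in row if c != pk_col)
        params = [v for c, v in row.items() if c != pk_col] + [pk_value]
        remote.execute(
            f"UPDATE {table} SET {assigns} WHERE {pk_col} = ?", params)
    else:
        # Not present remotely: plain insert.
        cols = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        remote.execute(
            f"INSERT INTO {table} ({cols}) VALUES ({marks})",
            list(row.values()))
    remote.commit()
```

Because the decision is made against the remote database's current state, the create-then-delete case, the changed-PK case, and the trigger-cascade case above all fall out naturally instead of needing special handling.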

>But there's lots of ways to skin a cat
Sure. I don't really mind how he does it (after all, that only concerns
him), and my method isn't necessarily the best. However, it does avoid
certain problems with the method that he proposed, which is why I
explained it. Naturally, these issues might not be a problem for him,
because it all depends on how complex your database is. My replicator
was specifically designed to work with a rather twisted database --
triggers everywhere, firing at you when you aren't looking! A simpler
database would have been much easier, but by then, it was obviously too
late; I had to make the replicator without radically redesigning the
database.

Anyway, hope this helps!

Jonathan Neve.
