Subject: Re: Short or long transactions
Author burmair
--- In firebird-support@yahoogroups.com, "Adam" <s3057043@...> wrote:
>
> --- In firebird-support@yahoogroups.com, "burmair" <YahooSpam@> wrote:
> >...In terms of performance, is it faster to do as much as possible in a
> > single transaction (or maybe a few "chunks"), or is it essentially the
> > same if each update is performed in its own transaction?
>
> Use a separate transaction when the work is unrelated. Use the same
> transaction when the work must be atomic (all or nothing).

How do you determine what constitutes "unrelated"? If you fire a
rifle, in the local system you have recoil and loss of mass as
first-order effects, but you don't really care if or when the bullet
hits something. Of course, this second impacted system cares; it has
to deal with a gain in mass and kinetic energy, but depending on how
you look at it, the two systems may be related or unrelated. What
really complicates the situation is that the second system may in turn
deliver "events" to subsequent systems in a cascade, and if you want
to do something like trace a root cause, then everything that happens
is related. If you concede that the resulting volume of data
(potentially millions of updates) is too large in practical terms for
a single transaction, and you have no choice but to sacrifice
atomicity, then the question becomes, what's the most efficient way to
partition the transaction? And that leads to my real question: how
much overhead is there in a Firebird transaction? If it's negligible,
then you might as well use a transaction per update. But if it's
non-negligible, you have to start thinking about packing multiple
updates per transaction. But first I need to know which Firebird
parameters I can manipulate that might affect the transaction overhead.
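
For concreteness, here's the sort of batching I'm considering, as a
minimal sketch using the Python fdb driver (any DB-API driver should
look much the same); the DSN, table name, and BATCH_SIZE are just
placeholders, and produce_updates() is a hypothetical stand-in for
wherever the updates come from:

import fdb

BATCH_SIZE = 1000  # placeholder: how many updates to pack into one transaction

con = fdb.connect(dsn='localhost:/data/events.fdb',
                  user='SYSDBA', password='masterkey')
cur = con.cursor()

pending = 0
for event_id, new_state in produce_updates():  # hypothetical source of updates
    cur.execute("UPDATE events SET state = ? WHERE id = ?",
                (new_state, event_id))
    pending += 1
    if pending >= BATCH_SIZE:
        con.commit()  # ends this transaction; the next execute starts a new one
        pending = 0

if pending:
    con.commit()
con.close()

With BATCH_SIZE = 1 this degenerates into a transaction per update;
with a large value the start/commit cost is amortized over many rows,
at the price of per-row atomicity and a longer-lived transaction.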


>
> What factors
> > might make a difference? The size of the rows in question?
>
> For inserts, obviously; for updates it would depend on the amount changed.

This will affect the amount of work performed to actually store the
data, but (I suspect) will have no impact on the question of
transaction overhead. Is that incorrect?


>
> The
> > number of rows in the tables?
>
> Shouldn't

Are you saying that the number of rows has no effect on transaction
overhead? That's plausible, but again, I'm not concerned with the
amount of work required to actually store data in the DB.


>
> > The mix of update/insert/delete?
>
> updates and deletes create record versions which must be garbage
> collected at a later time, so only update or delete records that you
> want to (there was a post a few days back where someone was deleting
> records only to reinsert them).

I don't think that's an issue here. But I do have some control over
the ordering, if it's going to impact transaction overhead.
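
Where I do delete and then reinsert the same logical row, I can
collapse that into a single statement instead, which (as I understand
it) leaves less behind for garbage collection than a delete followed
by an insert. A sketch, again with fdb and made-up table/column names:

import fdb
con = fdb.connect(dsn='localhost:/data/events.fdb',
                  user='SYSDBA', password='masterkey')
cur = con.cursor()

# Firebird 2.1+ UPDATE OR INSERT: updates the row matching the key,
# or inserts it if no match exists -- avoids the DELETE + INSERT pattern.
cur.execute("UPDATE OR INSERT INTO events (id, state) VALUES (?, ?) MATCHING (id)",
            (42, 'processed'))
con.commit()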


>
> > Cache size?
>
> Yes.

How will cache size affect transaction overhead? Will a larger or
smaller cache change that overhead, and how will it affect the actual
work performed within the transaction?
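
For reference, here's how I plan to check what cache the database is
actually getting, a sketch assuming Firebird 2.1+ (for the monitoring
tables) and the same fdb driver; the DSN is a placeholder:

import fdb
con = fdb.connect(dsn='localhost:/data/events.fdb',
                  user='SYSDBA', password='masterkey')
cur = con.cursor()

# MON$PAGE_BUFFERS is the page cache size (in pages) for this database;
# it can be changed with "gfix -buffers N <database>" or, server-wide,
# with DefaultDbCachePages in firebird.conf.
cur.execute("SELECT MON$PAGE_BUFFERS, MON$PAGE_SIZE FROM MON$DATABASE")
buffers, page_size = cur.fetchone()
print("cache: %d pages x %d bytes = %d bytes" % (buffers, page_size, buffers * page_size))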


>
> >Forced writes?
>
> Yes, but if you care about your data, this is non-negotiable.

Let's say I don't care about my data. How will forced writes affect
transaction overhead?
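
Just so we're talking about the same setting, this is the flag I mean,
checked the same way (sketch, same assumptions as above):

import fdb
con = fdb.connect(dsn='localhost:/data/events.fdb',
                  user='SYSDBA', password='masterkey')
cur = con.cursor()

# MON$FORCED_WRITES: 1 = synchronous (forced) writes, 0 = buffered writes.
# It can be toggled with "gfix -write sync <db>" or "gfix -write async <db>".
cur.execute("SELECT MON$FORCED_WRITES FROM MON$DATABASE")
(forced,) = cur.fetchone()
print("forced writes:", "on" if forced else "off")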


>
> > Indices?
>
> Absolutely. Massive benefit to some selects, slight penalty for
> everything else so design with respect to how you anticipate the data
> to be used.

There are surprisingly few selects, but I'm guessing that a row update
is going to use the same internal mechanism that selects a row, so
indices will play a role. But the question is, will the indices
affect transaction setup and teardown?
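
To make that concrete: the updates locate rows by a single key column,
so the index in question would be something like the following (sketch;
the names are placeholders, and if the column is already a primary key
it has an index anyway):

import fdb
con = fdb.connect(dsn='localhost:/data/events.fdb',
                  user='SYSDBA', password='masterkey')
cur = con.cursor()

# Without an index on events.id, every "UPDATE ... WHERE id = ?" has to scan
# the table to find its target row; with the index, the row is located directly.
# The index itself must be maintained on inserts and on updates that change
# the indexed column, which is the "slight penalty" mentioned above.
cur.execute("CREATE INDEX idx_events_id ON events (id)")
con.commit()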