Subject Re: [Firebird-Architect] Group Commits
Author David Jencks
This idea works really well, even in Java. The ObjectWeb folks have
written HOWL (High-speed ObjectWeb Logger) for the persistence part of
an XA transaction manager using this idea. I don't remember the latest
figures very well, but I think on cheap hardware we are seeing
somewhere around 20,000 writes/sec (10,000 XA transactions/sec). I
could be wrong, but I think the force interval is more like 10 msec
rather than 200. On the other hand, we aren't really expecting much
other disk access.
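To put the two force intervals side by side, here is a rough back-of-envelope sketch in Python. It uses only the recalled figures above (about 20,000 writes/sec, a 10 msec force interval versus the 200 msec cycle proposed below) and assumes the write rate stays the same at either interval, so treat the results as order-of-magnitude at best. The function name is made up for illustration.

```python
def writes_per_force(writes_per_sec: int, force_interval_ms: int) -> int:
    """Average number of log writes that share one forced disk write."""
    return writes_per_sec * force_interval_ms // 1000


# Recalled figure: ~20,000 writes/sec with a force roughly every 10 msec.
print(writes_per_force(20_000, 10))
# Same write rate, but forcing only five times a second (every 200 msec).
print(writes_per_force(20_000, 200))
```

Either way, each physical force is shared by a few hundred to a few thousand logical writes, which is where the gain comes from.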

david jencks

On Nov 6, 2004, at 8:34 AM, Jim Starkey wrote:

> The gating factor in readonly Vulcan transaction performance is the
> cyclical writing of the header and transaction inventory (tip) pages.
> The superserver solution is sufficiently unpalatable that some
> rethinking is in order, which is pushing me in the direction of group
> commits. I'm sure other people have been thinking about this as well,
> so I thought this would be a good time to exchange thoughts.
> To recap the transaction order of battle:
> 1. A transaction is started by taking out a transaction id from the
> database header page.
> 2. A transaction may make some updates. Let's call the pages
> affected primary update pages.
> 3. A transaction may perform some garbage collection. Let's call
> the pages affected secondary update pages.
> 4. A transaction is committed by first flushing all its dirty pages
> (both primary and secondary) from the page cache, then updating
> and writing the tip to reflect the new transaction state. Only
> after the tip has been written (quote safe on oxide unquote) is
> the user notified that the transaction is complete.
> Dirty cache pages are currently associated with transactions by a
> mask, keyed on transaction id modulo 32, of the transactions that have
> dirtied the page since it was read or last written.
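The mask described above can be pictured with a small Python sketch. This is only an illustration of a 32-bit mask keyed on transaction id modulo 32; the class and method names are hypothetical and are not taken from the engine source.

```python
MASK_BITS = 32  # the mask is keyed on transaction id modulo 32


def transaction_bit(transaction_id: int) -> int:
    """Bit that stands for this transaction in a page's dirty mask."""
    return 1 << (transaction_id % MASK_BITS)


class CachePage:
    """Hypothetical stand-in for a page cache buffer."""

    def __init__(self) -> None:
        self.dirty_mask = 0  # no transaction has dirtied the page yet

    def mark_dirty(self, transaction_id: int) -> None:
        self.dirty_mask |= transaction_bit(transaction_id)

    def dirtied_by(self, transaction_id: int) -> bool:
        return bool(self.dirty_mask & transaction_bit(transaction_id))

    def written(self) -> None:
        self.dirty_mask = 0  # a physical write clears the association


page = CachePage()
page.mark_dirty(5)
print(page.dirtied_by(5), page.dirtied_by(37), page.dirtied_by(6))
```

Transactions 5 and 37 share a bit, so a flush on behalf of one may also write pages dirtied only by the other. That costs an extra write now and then but never skips a page that needed writing.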
> For the purpose of analysis, we have to consider the state of the
> database on disk if the engine is suddenly stopped at any point. We
> assume that a page has been written by the operating system when it
> returns from the write call. We know this isn't always true, but
> there's little we can do about it.
> For readonly transactions (transactions that dirty no primary update
> pages), there is no reason to force either the header or transaction
> inventory pages to disk. They'll get written sooner or later (or maybe
> never), but this doesn't make any difference. A readonly transaction
> can safely piggyback on a later readwrite transaction. It would be
> nice, however, to eventually flush any secondary update pages.
> Readwrite transactions must write the header page before writing any
> primary update pages, and must write all primary update pages before
> writing the tip.
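The two ordering rules for readwrite transactions can be written down as a small checker. This is a sketch in Python with invented names and string labels for pages. It assumes that a header missing from the write list is already on disk, and it places secondary pages before the tip as in step 4 of the recap.

```python
def readwrite_commit_writes(header_dirty, primary_pages, secondary_pages):
    """Return page writes in an order that satisfies the rules above."""
    writes = []
    if header_dirty:
        writes.append("header")            # header goes out first
    writes += [f"primary:{p}" for p in primary_pages]
    writes += [f"secondary:{p}" for p in secondary_pages]
    writes.append("tip")                   # tip goes out last
    return writes


def order_is_safe(writes):
    """Check: header before any primary page, tip after every primary page."""
    primaries = [i for i, w in enumerate(writes) if w.startswith("primary:")]
    if not primaries:
        return True  # readonly: nothing forces the header or the tip
    header_ok = "header" not in writes or writes.index("header") < min(primaries)
    tip_ok = "tip" in writes and writes.index("tip") > max(primaries)
    return header_ok and tip_ok


print(readwrite_commit_writes(True, [7, 8], [9]))
print(order_is_safe(["header", "tip", "primary:7"]))
```

If the engine stops between any two writes in a safe order, the disk never shows a transaction as committed in the tip while one of its primary pages is still missing.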
> The obvious solution to group commits is a commit thread that
> periodically wakes up, checks for transactions pending completion,
> updates the header page if dirty, flushes dirty pages belonging to
> transactions pending commit, then updates the tip. All pending
> transactions are then committed. A plausible commit cycle would be four
> or five times a second, but would, of course, be settable by
> configuration file. Let's call it five times a second for discussion.
> The scheme means that the header and tip pages would be written at most
> 5 times a second without regard to the number of transactions that
> transpire. The scheme reduces both the header and tip hot spots as well
> as any operational hot pages (index, generators, reads used as locks,
> etc.) by reducing page changes by many transactions to a single physical
> write per page.
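The bound on tip writes is easy to state as arithmetic. The Python sketch below assumes that, without group commit, every committing transaction writes the tip once (as in step 4 of the recap), and that with group commit the tip is written at most once per cycle. The function names and the 1,000 commits/sec load are illustrative, not measurements.

```python
def tip_writes_per_sec(commits_per_sec, group_cycles_per_sec=None):
    """Upper bound on tip writes per second, with or without group commit."""
    if group_cycles_per_sec is None:
        return commits_per_sec                      # one tip write per commit
    return min(commits_per_sec, group_cycles_per_sec)


def worst_case_wait_ms(group_cycles_per_sec):
    """Longest a commit can sit waiting for the next cycle to begin."""
    return 1000 / group_cycles_per_sec


print(tip_writes_per_sec(1000))       # individual commits
print(tip_writes_per_sec(1000, 5))    # five group commit cycles a second
print(worst_case_wait_ms(5))          # price paid in commit latency
```

The trade is a fixed ceiling on header and tip writes in exchange for up to one cycle of added commit latency, which is why the cycle length is worth making configurable.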
> The scheme seems to work equally well in superserver, classic, and mixed
> environments.
> The implementation would be a class CommitManager with a single instance
> hanging off the Database object. TRA_commit, et al, would call the
> commit manager to perform cache flushing and tip management. With group
> commit turned on, a commit would add the transaction to a list, "or" the
> transaction mask to a group flush mask, and wait for a wake up. When
> the group commit thread wakes up, it writes the database header page if
> necessary, uses the group flush mask to flush the cache, updates the
> tip, then wakes up the snoozing transaction threads.
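For concreteness, here is a minimal Python sketch of the CommitManager flow just described. Only the class name CommitManager and the overall sequence come from the message above. The method names, the FakeDatabase stand-in, and its three write methods are assumptions made up for illustration, and a real engine would do this in its own language with real page I/O.

```python
import threading


class FakeDatabase:
    """Stand-in that records what would be written (purely illustrative)."""

    def __init__(self):
        self.log = []

    def write_header_if_dirty(self):
        self.log.append("header")

    def flush_pages(self, mask):
        self.log.append(("flush", mask))

    def write_tip(self, transaction_ids):
        self.log.append(("tip", sorted(transaction_ids)))


class CommitManager:
    def __init__(self, database, interval=0.2):
        self.database = database
        self.interval = interval          # seconds per cycle; 0.2 is five a second
        self.lock = threading.Lock()
        self.pending = []                 # (transaction id, wake-up event) pairs
        self.flush_mask = 0               # "or" of the pending transaction masks
        self.stopping = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def commit(self, transaction_id):
        """Called by a transaction thread; returns once its group is on disk."""
        done = threading.Event()
        with self.lock:
            self.pending.append((transaction_id, done))
            self.flush_mask |= 1 << (transaction_id % 32)
        done.wait()

    def run_cycle(self):
        """One group commit: header, flush by mask, tip, then wake the waiters."""
        with self.lock:
            batch, mask = self.pending, self.flush_mask
            self.pending, self.flush_mask = [], 0
        if not batch:
            return 0
        self.database.write_header_if_dirty()
        self.database.flush_pages(mask)
        self.database.write_tip([tid for tid, _ in batch])
        for _, done in batch:
            done.set()
        return len(batch)

    def _run(self):
        # Event.wait returns False on timeout, so this loops once per interval.
        while not self.stopping.wait(self.interval):
            self.run_cycle()
        self.run_cycle()                  # drain anything queued before shutdown

    def start(self):
        self.thread.start()

    def stop(self):
        self.stopping.set()
        if self.thread.is_alive():
            self.thread.join()
```

A transaction that queues itself just after a cycle has taken its batch simply waits for the next cycle, so nothing is lost, only delayed by at most one interval.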
> It seems so simple and obvious I can't imagine why it hasn't been done.
> Have I missed something?