Subject Re: [firebird-support] Deadlock when 2 session update the same 01 row
Author Ann Harrison
On Wed, Nov 28, 2012 at 9:42 AM, André Knappstein <
Knappstein@...> wrote:


> FWIW there seems to be a related question in the .net list.
> Jiri confirmed that the default is "NoWait". This makes my
> misunderstanding of the term "deadlock" complete, because in NoWait I
> usually get an exception right off, which is all but a deadlock in the
> way I used to understand it.
>

There's a some history and a bit of theory here.

The history is that in the early 1980's DEC had two groups writing
relational databases. One group used traditional record-lock based
concurrency control and one used MVCC. Guess which one Jim Starkey led,
and for extra points, how there happened to be two projects. At the time
DEC was pushing VAX clusters, which were groups of separate machines that
shared storage through a thing called the HSC. Part of the VAX Cluster
architecture was a distributed lock manager. Very clever bit of software,
which made locking possible in shared nothing, but locks were both slow and
available in limited quantities. Anyway, that's why a non-lock based
solution was attractive.

Having competing projects has some benefits - easy to do performance
testing side by side, and develops some real identification with the
project. Releasing two relational databases and making customers choose
between them was less attractive, so a rule emerged. The interfaces had to
be identical. Not just the API, but the system tables, and the error codes
- everything had to be consistent, so you could take an application running
against one database, change the logical name that lead to the database
shared library and magically you'd be running the other database. And, of
course, a common backup format so you back up one and restore as the other.

Error codes were a real trial. Some errors only occur with record-locking
system (e.g. out of locks) and some only with MVCC - update conflict. The
two project leaders and I met with a mediator, David Hartzband - he has a
doctorate in philosophy from Heidelberg (the one in Germany), is a database
expert, and worked as a bouncer at the Buckets of Blood bar outside the
gates of a steel mill in either Pittsburg PA or Youngstown Ohio. I forget
which. A man who could be very convincing. Errors were divided into
groups by the way a program could deal with them. Codes common to both
system and the top code for a group of similar codes became primary error
codes. Codes that were specific to a database were secondary codes.
That's how Update Conflict got to be a secondary code under Deadlock.

The theory is that if you encounter an update conflict, you're going to see
that same conflict if you retry your transaction before the other
transaction commits. That's pretty much live-lock: try, fail, try again,
fail again, use all the system resources and don't get anywhere. That's
why the original default was WAIT. If you wait, then there's a chance your
next attempt will succeed. Yes, it's a problem if the other guy is running
long update transactions (why would he do that?) and you've got a user
waiting for an answer. Lots of interfaces use NO_WAIT as the default.

Good luck,

Ann


[Non-text portions of this message have been removed]