Subject | Re: [Firebird-Architect] Interesting paper |
---|---|
Author | Ann W. Harrison |
Post date | 2010-09-07T15:13:03Z |
paulruizendaal wrote:
http://db.cs.yale.edu/determinism-vldb10.pdf
I'm trying to plow through it. One thing that I find really annoying
about academic papers on concurrency is that they equate isolation with
serializability. MVCC transactions are absolutely isolated. In fact,
they're not serializable because they are so isolated that the actions
of concurrent transactions don't even make them wait, unless they try
to overwrite each other.
Serializability is a side-effect of applying enough locks to produce
repeatable reads. MVCC gets repeatable reads for free. The open
question (in my not so open mind) is whether serializability in and
of itself is a virtue. Read consistency is a virtue without question.
Without it accounts don't balance. But is it important that two
concurrent transactions start with an empty table, each execute this
statement twice: INSERT INTO t1 (c1) SELECT COUNT(*) FROM t1,
then commit, and produce this content in t1: (0, 1, 2, 3)?
In MVCC, each of the transactions sees an empty table first, then
sees the one record it has stored, so the result is (0, 0, 1, 1).
Which is correct, but non-intuitive.
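
To make the (0, 0, 1, 1) result concrete, here is a toy Python model of the
two transactions. It is only a sketch: the Transaction class and the
run_concurrent and run_serial helpers are names invented for illustration,
and the "snapshot" is just a list copy, not how any real engine stores
record versions.

```python
# Toy model of snapshot reads; an illustration, not a real MVCC engine.

class Transaction:
    def __init__(self, committed_rows):
        # Snapshot: the rows that were committed when this transaction started.
        self.snapshot = list(committed_rows)
        # Rows this transaction has inserted but not yet committed.
        self.own_rows = []

    def insert_count(self):
        # Models: INSERT INTO t1 (c1) SELECT COUNT(*) FROM t1
        # The count sees the snapshot plus this transaction's own inserts only.
        visible = len(self.snapshot) + len(self.own_rows)
        self.own_rows.append(visible)


def run_concurrent():
    table = []  # committed contents of t1
    a = Transaction(table)  # both transactions start on an empty table
    b = Transaction(table)
    for _ in range(2):  # each runs the statement twice, interleaved
        a.insert_count()
        b.insert_count()
    table.extend(a.own_rows)  # commit a
    table.extend(b.own_rows)  # commit b
    return sorted(table)


def run_serial():
    table = []
    for _ in range(2):  # two transactions, strictly one after the other
        t = Transaction(table)
        t.insert_count()
        t.insert_count()
        table.extend(t.own_rows)  # commit before the next one starts
    return sorted(table)


print(run_concurrent())  # [0, 0, 1, 1]
print(run_serial())      # [0, 1, 2, 3]
```

Neither concurrent transaction ever waits on the other, and each sees a
perfectly consistent view, yet no serial ordering of the two could have
produced the combined result.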
In lock-based concurrency, a more likely result is (0, 1) plus a
rollback. That happens when both transactions read the table, getting
a read lock on the end of the index, or on the table if there is no
index. The attempts to upgrade the locks from read to write then
deadlock.
And, of course, if you really care that the results end up as
(0, 1, 2, 3), you can always serialize the MVCC transactions with
a unique index.
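
A rough Python sketch of how a unique index forces that outcome, under
loud assumptions: UniqueViolation, run_transaction, commit and
run_with_unique_index are all hypothetical names, and the uniqueness check
is done at commit time purely to keep the model short. A real engine would
typically detect the duplicate key when the insert is attempted, and might
make the second transaction wait first.

```python
# Toy model: a unique index on c1 makes the losing transaction retry.

class UniqueViolation(Exception):
    """Raised when a commit would store a duplicate value in c1."""


def run_transaction(table):
    # Run the statement twice against a snapshot of the committed rows.
    snapshot = list(table)
    own_rows = []
    for _ in range(2):
        # Models: INSERT INTO t1 (c1) SELECT COUNT(*) FROM t1
        own_rows.append(len(snapshot) + len(own_rows))
    return own_rows


def commit(table, own_rows):
    # Simplified uniqueness check against what is already committed.
    if any(value in table for value in own_rows):
        raise UniqueViolation(own_rows)
    table.extend(own_rows)


def run_with_unique_index():
    table = []
    a_rows = run_transaction(table)  # both start on the empty table
    b_rows = run_transaction(table)
    commit(table, a_rows)            # a wins and stores 0 and 1
    retries = 0
    while True:
        try:
            commit(table, b_rows)
            break
        except UniqueViolation:
            # Roll back and rerun b with a fresh snapshot.
            retries += 1
            b_rows = run_transaction(table)
    return table, retries


print(run_with_unique_index())  # ([0, 1, 2, 3], 1)
```

The cost of the serial-looking result is the retry: the second transaction
does its work twice.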
More when and if I understand the paper.
Cheers,
Ann
> There is some interesting discussion around a scale-out database paper over at highscalability: Here's the VLDB paper they reference
> http://highscalability.com/blog/2010/9/1/paper-the-case-for-determinism-in-database-systems.html
>
> To me it seems that this is just another middleware serialiser, but that probably means that I have not understood the paper very well.