firebird-architect - Re: [Firebird-Architect] Re: Special Relativity and the Problem of Database Scalability

Subject	Re: [Firebird-Architect] Re: Special Relativity and the Problem of Database Scalability
Author	Jim Starkey
Post date	2010-01-28T21:31:52Z

unordained wrote:

>> I don't think that works. Tx 2 is inconsistent at commit time, which
>> is a no-no. True, the global database is consistent if tx 2 would be
>> allowed to commit. That said, there is a risk that a Tx 3 that saw
>> the Tx 2 commit before the Tx 1 would see an inconsistent view of the
>>
> database.
>
>> The this brave, new, Einsteinian world, there is no fixed cross node
>> ordering of events in the absence of a constraint. Constraints must
>> be enforced around* the time of an update, and continue to be true
>> until commit if the transaction can be allowed to commit.
>>
>> * the timing of constraint violation is more a function of mandatory
>> SQL error reporting, though I supposed almost anything could be left
>> as a last minute surprise. I prefer to have a statement fail with a
>> proper diagnostic, but maybe this isn't required by theory. Still,
>> getting a referential integrity error from a commit is unsettling.
>>
> ------- End of Original Message -------
>
> There's a difference, I think, between saying that you can allow multiple
> transactions (on different nodes) to commit simultaneously (which many systems
> don't allow, so that's already a bonus) and saying that you can start a
> transaction viewing changes made by a transaction that depend on changes made
> in another, invisible transaction. I don't know that this implies the converse
> of your statement, however (in the presence of constraints, order must be
> preserved) -- as long as a transaction starts with a consistent view of the
> database (even if not the "true" one, whatever that means), that's good enough?
>

Let's look at some specifics. Tx1 and Tx2 each try to update the same
record. This can be performed consistently by each going to a
(hypothetical) resolution agent that can say, "sure, go ahead" or "nope,
your conflicting with tx <n>." This is, in effect, what happen inside
of Firebird (et al). So Tx 1 can't see Tx2, an engine can feel its
presence.

Resolution agents are a necessary component of the architecture. A
pending tx can register a dependence on a master record, for example,
and the resolution agent from prevent concurrent transactions from
deleting the master record for the during of the pending tx.

There can't be consistency without communication. The trick is to
minimize the communication and, to the degree possible, allow it to
operate asynchronously. A pending transaction can make a tentative
update and continue to execute pending a return message from the
resolution agent as long as it can rollback to the update statement and
return an error like a proper SQL database. This means, specifically,
that a mass update can have hundreds of pending resolution in flow, so
it needn't block until statement end and it has to return a proper
status code to the client.

> Yes, you can fix this loophole by saying that constraints must also validate
> inside the transaction itself, independently of other commits. Essentially,
> every constraint gets checked twice -- once within the transaction, once within
> the global scope. I'd call that transaction-consistency, rather than database-
> consistency: nothing "weird" can show up inside a transaction.
>

It's probably a useful optimization to do a local validation before
sending off to a resolution agent, but it isn't required. But if the
transaction is going to violate a local constraint, the resolution agent
is going to make it any better.

> Except deferred constraints allow just that. Defer an FK, and your transaction
> can see "weird" stuff, which it has to be willing to put up with. Queries can't
> assume that just because a constraint exists, the visible data will conform to
> it. Defer a PK, and your transaction can wind up duplicating a row, then find
> itself unable to determine which to remove to fix the problem. Cascade events
> get nastier with deferred PK's. I think it's safe to say not *all* constraint
> types are deferrable.
>
> It would be nice to ask to defer constraints, and also ask for their validation
> independently of a commit. Iterate over a task which, internally, must first
> put the database into a weird state, then fix it -- after each iteration, it
> may want to revalidate (some? all?) constraints before moving on.
>

The smart thing is to validate everything at statement time and
re-evaluate only the failures at commit time. But deferring blocking
for validation requests until commit time maximizes parallelism.

> If one is willing to let a transaction see a weird state (internally) with
> deferred constraints, should transactions have the option of starting in an
> inconsistent state, knowing that this is only a problem of visibility, which
> will be remedied by someone else? Sure, tx 3 can see a detail record with no
> parent, but it can also trust that said parent will exist, eventually. If
> queries can be written to survive weird states from deferred constraints, why
> not from this as well? (new transaction option: "require internally consistent
> view before starting")
>
>

No. With th exception of its own updates, a transaction must see a
consistent view of the database.

--
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376

[Non-text portions of this message have been removed]