firebird-support - Re: [firebird-support] Rec Ver / Updates and Deletes

Subject	Re: [firebird-support] Rec Ver / Updates and Deletes
Author	Ann W. Harrison
Post date	2006-06-20T16:30:52Z

Johannes Pretorius wrote:

>
> I have been thinking about rec versioniong and transactions and am I correct to
> say that new records of the data is retrieved after a transactions is
> commited not when started ?

I'm not sure I understand what your asking, but let me explain what
I do understand. When a transaction - and all operations in Firebird
occur in transactions - inserts a record, the record is placed on a
database page in the server cache. The record as a header that
includes the transaction id of the transaction that inserted it.

When the cache page is written to the database, the record goes
with it. The page may be written for one of several reasons. One
is that the transaction that inserted the record commits. Before
the commit completes, all changes made by the transaction are
written to disk. Another is that a different transaction wrote
a record on the page and that transaction committed, causing the
new page to be written with all the changes on it - committed or
otherwise. A third reason is that the cache is full and that
page is the least interesting, so it gets flushed to disk.

So, a record in the database may be committed or not. The server
keeps track of the state of "interesting" transactions. When a
transaction asks to read a record, the server checks whether the
record was created by a transaction whose results are legitimate
for the current reader. The definition of "legitimate" depends
on the type of transactions. A concurrency transaction sees only
records that were committed when it started. A read-committed
transaction sees all committed records. If the record is OK,
the system returns it. Otherwise, it checks for older versions.
For a newly inserted record, there are no older versions, so
nothing is returned.

When a transaction updates or deletes a record, it creates a new
version on a data page in the cache. The new version is stored
on the same page as the previous version. If there's room for
both, the old and new version are on the same page. If not, the
old version is moved. The page can be written from cache for
any of the reasons above.

When a transaction tries to read a record that's been updated
or deleted, the system checks to see if the new version is
legitimate for the current transaction. If not, it checks
the next older version. If that's not legitimate, the system
checks the next older, etc., returning the first legitimate
version or nothing.

When a transaction tries to update a record, the system first
checks that the most recent version of that record is legitimate
for that transaction. If there are changes to the record that
the updating transaction can not see, the update fails with an
error like "Deadlock - update conflict".

>
> The following is based on a transaction with NO Rec Versioning.
>
> Then also what I want to know is what happens in this scenario.
> We have a client (A) that connects to the database and has query
> on a table (tbl_x). Then we have another client (B) that runs
> deletes in intervals on the same table. obviously client B's
> version of the data is correct most of the time.

That's less obvious than you might think. If client A is a
concurrency transaction, then it sees a consistent snapshot
of the database showing the committed data when A started.
If A runs the same query twice, it gets the same results.
That's considered "correct" in database circles. One reason
is that B may fail, and since B's transactions are atomic,
all its change disappear. In database circles, it's
considered quite "incorrect" to read data that later
disappears.

> Now client A does updates to tbl_x every now and
> again thanks to certain factor. If the time has
> come for A to start it's update on tbl_x. But
> now the version is incorrect as some records has been
> delete from the table by B that A is not aware of.

Assuming that A is running a single transaction in
concurrency mode, A will get an update conflict error
when it tries to update a record that B has deleted.
If A is running a single transaction in read-committed
mode and B has committed, the system will recognize that
the record has been deleted and A will not get a chance
to update it. If B has not committed, A will get an
update conflict error.

> WHAT will the effect be on this , regarding indexes ?

Indexes follow the same rules.

>
> Am I correct to say in the above sceanario that one must
> rather do a transaction start and commit , before
> one refreshes one's query ?

A concurrency mode transaction cannot "refresh" a query -
the rules say that a query gets the same results every time,
regardless of what else is going on. You can "refresh"
results of a read-committed transaction, and, assuming
that all changes made by other transactions are committed,
you can then update those results.

No transaction can overwrite the changes made by a
concurrent uncommitted transactions, nor can any
transaction overwrite changes it cannot read.

Regards,

Ann