Subject Re: [firebird-support] Database corruption (new instance)
Author Vlad Khorsun
> Hello Group,
>
> Back in late October, I reported an issue where a customer was
> experiencing regular database corruption (~3 times/week). It was
> always an error in one of several indices on a very busy table.
>
> Details here:
> http://tech.groups.yahoo.com/group/firebird-support/message/97962
>
> The solution offered there was to upgrade to 2.1. The customer was
> upgraded to 2.1 in early December and had been running for over two
> months without issue.

Sounds good.

> Yesterday however, one of our automated tools started logging a
> failure from the database that at first glance appears to be very similar.
>
> Environment:
> Windows 2003 Server
> Firebird 2.1 Classic
> 16 CPU Cores
> FDB file ~2.5GB
>
> From firebird.log
> ---
> DBSERVER Mon Feb 09 11:49:07 2009
>
> Database: PRODUCTdb
>
> database file appears corrupt (E:\PRODUCT\DATA\PRODUCTDB.FDB)
>
> wrong page type
>
> page 295704 is of wrong type (expected 7, found 3)
>
> internal gds software consistency check (error during
> savepoint backout (290), file: exe.cpp line: 4034)
> ---
>
> I needed to run gfix -validate, but being a live database I didn't
> want to take everyone out unless absolutely necessary. I made a copy
> of the database (nbackup -L, file copy, nbackup -N, nbackup -F on
> copy). I then proceeded to validate the copy.
>
> As expected, there were a few (3 from memory) index page corruptions.
>
> From Firebird.log:
> ---
> DBSERVER Mon Feb 09 11:57:33 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Index 1 is corrupt on page 295473 level 0. File:
> ..\..\..\src\jrd\validation.cpp, line: 1537
>
> in table SOMETABLE (215)

I have never seen this error before. It means that some index key is less than
the previous one, which is very strange.
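
To illustrate what that ordering check amounts to, here is a minimal sketch in
Python. It is not the code from validation.cpp: it assumes the keys of one index
page have already been decoded into byte strings that compare bytewise, and
first_order_violation is a hypothetical helper name.

```python
from typing import Optional, Sequence


def first_order_violation(keys: Sequence[bytes]) -> Optional[int]:
    """Return the position of the first key that sorts below its predecessor.

    Hypothetical helper: assumes the keys are already decoded from the page.
    Equal neighbours are allowed (duplicates). Returns None when the keys
    are in non-decreasing order.
    """
    for pos in range(1, len(keys)):
        if keys[pos] < keys[pos - 1]:
            return pos
    return None
```

A page whose keys make this return anything other than None is what the
validation reports as a corrupt index page.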


> DBSERVER Mon Feb 09 11:57:34 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Page 295704 wrong type (expected 7 encountered 3)

I would like to fix such errors, but I have no reproducible test case :(


> DBSERVER Mon Feb 09 11:57:34 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Index 4 is corrupt on page 295704 level 255. File:
> ..\..\..\src\jrd\validation.cpp, line: 1454
>
> in table SOMETABLEREPLICATION (264)
>
>
>
>
>
> DBSERVER Mon Feb 09 11:57:34 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Index 4 is corrupt on page 295704 level 255. File:
> ..\..\..\src\jrd\validation.cpp, line: 1468
>
> in table SOMETABLEREPLICATION (264)

These two errors occur because page 295704 is not an index page.


> DBSERVER Mon Feb 09 11:57:38 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Page 297835 is used but marked free
>
>
>
>
>
> DBSERVER Mon Feb 09 11:57:38 2009
>
> Database: E:\PRODUCT\ADAM\PRODUCTDB.FDB
>
> Page 297895 is used but marked free

I have not seen this before either.


> From gfix, I identified the table, dropped and recreated the foreign
> key constraints, and the issue is "resolved" (ie, the automated tool
> succeeds now).
>
> Some questions:
> * Clearly the upgrade to 2.1 resolved a lot of these index issues. Are
> there still known issues with indices on extremely busy tables? (this
> table flags PKs requiring replication to hundreds of remote devices,
> so it is not uncommon to manipulate hundreds of records per second in
> this table).

There are still some rare reports of

"Page XXX wrong type (expected 7 encountered N)"

with N mostly being 5 (data page).
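
When collecting such reports, it can help to pull the page number and the two
type codes out of firebird.log automatically. The following is a rough sketch
in Python that handles both message variants shown in this thread. The names
in PAGE_TYPES are an assumption: they are the page type codes as I recall them
from ods.h in the 2.x sources, so verify them against your exact version.
parse_wrong_type is a hypothetical helper name.

```python
import re
from typing import NamedTuple, Optional

# ASSUMPTION: page type codes as recalled from ods.h (Firebird 2.x).
# Only 5 (data page) and 7 (index page) are confirmed in this thread.
PAGE_TYPES = {
    1: "header",
    2: "page inventory",
    3: "transaction inventory",
    4: "pointer",
    5: "data",
    6: "index root",
    7: "index b-tree",
    8: "blob",
    9: "generators",
}

# Matches "page N is of wrong type (expected A, found B)" as well as
# "Page N wrong type (expected A encountered B)".
WRONG_TYPE = re.compile(
    r"page (\d+) (?:is of )?wrong type "
    r"\(expected (\d+),? (?:found|encountered) (\d+)\)",
    re.IGNORECASE,
)


class WrongPageType(NamedTuple):
    page: int
    expected: str
    found: str


def parse_wrong_type(line: str) -> Optional[WrongPageType]:
    """Decode one log line, or return None if it is not a wrong-type message."""
    m = WRONG_TYPE.search(line)
    if m is None:
        return None
    page, expected, found = (int(g) for g in m.groups())
    return WrongPageType(
        page,
        PAGE_TYPES.get(expected, f"unknown ({expected})"),
        PAGE_TYPES.get(found, f"unknown ({found})"),
    )
```

If the table above is right, the "found 3" in the log quoted earlier would be
a transaction inventory page rather than the more commonly reported data page.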


> * I observed something unusual with nbackup. After running -N, the
> delta file was left. Obviously it is left during the merge, but using
> process explorer I could see that there was no fb_inet_server.exe
> instances holding a handle. Any ideas?

Upgrade to 2.1.2 as soon as it is released. It has some fixes for nbackup.
I don't remember exactly; 2.1.1 probably has an nbackup patch as well. The
Release Notes will definitely help you.
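
For reference, the copy procedure described in the question (lock, file copy,
unlock, fixup on the copy, then validate the copy) could be scripted roughly as
below. This is only a sketch under assumptions: nbackup and gfix are on the
PATH, credential options are omitted, the paths are placeholders, and
snapshot_steps and snapshot_and_validate are hypothetical names.

```python
import shutil
import subprocess
from typing import List


def snapshot_steps(db: str, copy: str) -> List[List[str]]:
    """Commands run around the file copy, in order. Paths are placeholders."""
    return [
        ["nbackup", "-L", db],                # lock: new writes go to the delta file
        ["nbackup", "-N", db],                # unlock: merge the delta back
        ["nbackup", "-F", copy],              # fixup: make the copy a normal database
        ["gfix", "-v", "-full", "-n", copy],  # validate the copy without updating it
    ]


def snapshot_and_validate(db: str, copy: str) -> None:
    lock, unlock, fixup, validate = snapshot_steps(db, copy)
    subprocess.run(lock, check=True)
    try:
        shutil.copyfile(db, copy)
    finally:
        # Always unlock, even if the copy fails, so the delta is merged back.
        subprocess.run(unlock, check=True)
    subprocess.run(fixup, check=True)
    subprocess.run(validate, check=True)
```

Running the unlock step in a finally block keeps the live database from staying
locked if the file copy fails.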

Regards,
Vlad