Subject Re: [firebird-support] Firebird Embedded corruptions
Author Jan Flyborg

Thanks for this and sorry for my slow response. Please see my comments below.

Best Regards
    //Jan Flyborg

2014-09-19 21:13 GMT+02:00 Ann Harrison [firebird-support]:

On Mon, Sep 15, 2014 at 7:41 AM, Jan Flyborg jan.persson@... [firebird-support] wrote:

I just made another posting where I tried to describe three different examples of things we have seen.

The first was a wrong page type, which sounds like a bug that was fixed in a newer version in code that's common to all Firebird architectures.  In your case, the bad page was in an index (7).  If you can find the index with the bad page and recreate it, all will be well.

Just as an FYI, the page types are:
     0 - Undefined; normally an uninitialized page, which indicates a bad page pointer elsewhere
     1 - Database header page
     2 - Page inventory page
     3 - Transaction inventory page
     4 - Pointer page
     5 - Data page
     6 - Index root page - contains information about each index on the table, one per table
     7 - Index (B-tree) page
     8 - Blob data page
     9 - Generator page
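
For anyone who wants to translate the number in a "wrong page type" message while reading logs, the list above can be written down as a small lookup. This is only a sketch: the constant names are made up for illustration and are not Firebird's own identifiers, and only the ten types listed above are covered.

```python
from enum import IntEnum


class PageType(IntEnum):
    """Page type codes as listed above (names are illustrative only)."""
    UNDEFINED = 0
    HEADER = 1
    PAGE_INVENTORY = 2
    TRANSACTION_INVENTORY = 3
    POINTER = 4
    DATA = 5
    INDEX_ROOT = 6
    INDEX_BTREE = 7
    BLOB = 8
    GENERATOR = 9


def describe_page_type(code: int) -> str:
    """Return a readable label for a page type code from an error message."""
    try:
        return PageType(code).name.replace("_", " ").lower()
    except ValueError:
        # Codes outside the list above are reported rather than guessed at.
        return f"unknown page type {code}"


if __name__ == "__main__":
    print(describe_page_type(7))
```

With the error from the first problem, `describe_page_type(7)` gives the index (B-tree) entry, which matches the diagnosis above.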

That sounds very good and it seems like an upgrade to 2.5.3 will make sure that we do not see this again.

The second problem (CCH_precedence: block marked.  file: cch.cpp line: 4390) is more concerning - I don't remember having read a bug report about it. CCH is the cache handler. A "mark" is the sign that a page is about to be changed. When Firebird is forced to write a page, either as part of a commit or to free space in the cache, it must first write out any pages that the page depends on. That's a little obscure. Suppose that the page you're about to write has a record with a back version, and the back version is on a different page. To keep the database consistent, the page with the back version must be on disk before the page that includes a record that points to the back version. Firebird keeps a list of precedence relationships and CCH goes through them before writing a page. I think the error means that someone is currently writing to a page that's on the precedence list. That should never happen. It's interesting that the problem occurred during an alter index operation. However, the database should be fine on disk and usable after you restart Firebird. Page marks are entirely in memory. It's quite possible that I missed a bug report and this problem was fixed in a later version.
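
To make the precedence idea concrete, here is a toy model of "write what a page depends on before writing the page itself". It is not Firebird's cache code; the function name, the page numbers and the dictionary layout are all assumptions made for illustration, and raising an error on a cycle is just a choice for this sketch.

```python
def flush_order(page, depends_on):
    """Return the order in which pages would have to reach disk.

    depends_on maps a page number to the pages that must be written
    before it (for example, the page holding a record's back version).
    """
    order = []
    visiting = set()
    done = set()

    def visit(p):
        if p in done:
            return
        if p in visiting:
            # A cycle would make a consistent write order impossible.
            raise ValueError(f"precedence cycle at page {p}")
        visiting.add(p)
        for dep in depends_on.get(p, ()):
            visit(dep)  # write the pages this page depends on first
        visiting.discard(p)
        done.add(p)
        order.append(p)

    visit(page)
    return order


if __name__ == "__main__":
    # Page 101 points at a back version on page 205, which in turn
    # depends on page 300 (all numbers are hypothetical).
    print(flush_order(101, {101: [205], 205: [300]}))
```

In this model, the error described above would correspond to one of the pages in the returned order being modified while the walk is still in progress.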

In case it helps: I was wrong in my original posting when I said we were using 2.5.1, so the line numbers in the exception may have led you to the wrong conclusion. We are currently using 2.5.2 and nothing else.

The third problem is that two records in a referencing table lack mates in the referenced table, despite a referential constraint. I have no idea how that happened, but it should be reasonably easy to fix in your database.

In another posting (later than yours), Fabiano says that these errors are connected to bad memory chips, so in the future we will instruct users who have this problem to run memtest86 overnight to check that the memory is physically OK. These constraint problems are actually the most common ones we see.

The first problem is what I would call a physical corruption - the internal structure of the database is corrupt. The second is an in-memory corruption - the disk database is OK, but the in-memory version is damaged. The third is logical corruption - the database is physically intact, but does not conform to the data rules.

Typically we fix our problems with gfix -mend followed by a backup/restore cycle. Usually some tables still have problems afterwards (typically foreign keys that refer to nonexistent primary keys), so where possible we remove the faulty records, and then it works again.
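
Roughly, the cycle described above looks like the command sequence below. This is a sketch, not our actual tool: the function name and file names are hypothetical, credentials (-user/-password) are left out, and the exact switches should be checked against the gfix and gbak documentation for the version in use. The function only builds the command lines and does not run anything.

```python
import shlex


def repair_commands(db, backup, restored):
    """Build the validate / mend / backup / restore command lines."""
    return [
        # Validate the database, including record fragments.
        ["gfix", "-v", "-full", db],
        # Mark damaged records as unavailable, ignoring checksum errors.
        ["gfix", "-mend", "-full", "-ignore", db],
        # Back up, ignoring checksum errors and skipping garbage collection.
        ["gbak", "-b", "-v", "-ig", "-g", db, backup],
        # Restore into a fresh database file.
        ["gbak", "-c", "-v", backup, restored],
    ]


if __name__ == "__main__":
    for cmd in repair_commands("broken.fdb", "broken.fbk", "restored.fdb"):
        print(shlex.join(cmd))
```

Restoring into a new file rather than over the original keeps the damaged database available in case it has to be sent in for closer inspection.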

The problem is that these are not my databases. I normally have no access to them, since they run in standalone installations at our customers' sites. Recently we have bundled our own homemade tool for repairing databases that our customers can use when they experience problems (basically a graphical frontend for gfix), but sometimes this is not enough and the databases have to be sent to us.

Gfix is pretty old and somewhat crude.  IBFirstAid might give you better help on physical corruptions.  Checking that there is no non-conforming data before creating constraints may help with logical corruption.  
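
A check for non-conforming data before creating a constraint can be as simple as an outer join that looks for child rows without a parent. The sketch below uses Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; the table and column names (parent, child, parent_id) are hypothetical, and against Firebird the same kind of query would be run through whatever driver or tool is at hand.

```python
import sqlite3

# Child rows whose foreign key value has no matching parent row.
ORPHAN_QUERY = """
SELECT c.id, c.parent_id
FROM child c
LEFT JOIN parent p ON p.id = c.parent_id
WHERE c.parent_id IS NOT NULL
  AND p.id IS NULL
ORDER BY c.id
"""


def find_orphans(conn):
    """Return (child id, missing parent id) pairs."""
    return conn.execute(ORPHAN_QUERY).fetchall()


def demo():
    """Build a tiny in-memory example with two orphaned child rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER);
        INSERT INTO parent VALUES (1), (2);
        INSERT INTO child VALUES (10, 1), (11, 2), (12, 3), (13, NULL), (14, 99);
    """)
    try:
        return find_orphans(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```

Rows with a NULL foreign key are skipped on purpose, since a referential constraint does not require them to have a mate.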

Yes, that would probably be a better choice for us, but we cannot bundle IBFirstAid together with our application. I will, however, download it and try it on the files that get sent to us.

Good luck (and my apologies for the late response)

No need for any apologies. I am very grateful to you for taking the time to help us.

Another thing: what do you think of the posting above, where the theory is that Volume Shadow Copy is interfering with the database? Have you heard about that before?

And one last comment. We have bundled Firebird with a great many installations of our product, so it might be that what we are seeing are very rare problems that no one else has experienced before. Do you think we should post a bug report every time we see an exception or a problem that you have not already been made aware of? After all, the second problem in my list was new to you.