Subject Re: [firebird-support] Firebird Embedded corruptions
Author Jan Flyborg
Hi,

First a sincere thanks to all of you for your answers.

We have all kinds of different corruptions and maybe do they not have the same root cause. Here I will give you three typical examples.

Example 1
This user complained that the system had stopped. Upon further investigation the following exception was found in our logs and when we received the database it was indeed corrupted.

Exception while executing job: NHibernate.Exceptions.GenericADOException: Error executing Enumerable() query[SQL: select sequencevo0_.recording_sequence_id as col_0_0_ from Recording_Sequence sequencevo0_ where  not (exists (select filesequen1_.SEQ_ID from File_Seq filesequen1_, Recording_Sequence sequencevo2_ where filesequen1_.SEQ_ID=sequencevo2_.recording_sequence_id and filesequen1_.SEQ_ID=sequencevo0_.recording_sequence_id)) and sequencevo0_.StopTime<@p0] ---> FirebirdSql.Data.FirebirdClient.FbException: database file appears corrupt (C:\PROGRAMDATA\AXIS COMMUNICATIONS\AXIS CAMERA STATION SERVER\ACS.FDB)

wrong page type

page 3819 is of wrong type (expected 7, found 3) ---> FirebirdSql.Data.Common.IscException: Exception of type 'FirebirdSql.Data.Common.IscException' was thrown.

   at FirebirdSql.Data.Client.Native.FesDatabase.ParseStatusVector(IntPtr[] statusVector)

   at FirebirdSql.Data.Client.Native.FesStatement.Fetch()

   at FirebirdSql.Data.FirebirdClient.FbCommand.Fetch()


Example 2
Here is another example of a corruption.

FirebirdSql.Data.FirebirdClient.FbException (0x80004005): unsuccessful metadata update
MODIFY RDB$INDICES failed
internal gds software consistency check (CCH_precedence: block marked (212), file: cch.cpp line: 4390) ---> unsuccessful metadata update
MODIFY RDB$INDICES failed
internal gds software consistency check (CCH_precedence: block marked (212), file: cch.cpp line: 4390)
at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteNonQuery()


Example 3
Here is a file that got sent in from a user that complained that his system was no longer working. For this file it looks like one table in the database has two records with foreign keys that refers to non existing primary keys in another table (which we have constraints for), so how this data has entered a transaction is somewhat of a mystery to us:

$ gfix -v -user SYSDBA -password masterkey acs_system_2014-06-17_16-17-33.737.fdb

$ gbak -b -g -user SYSDBA -password masterkey acs_system_2014-06-17_16-17-33.737.fdb out.fbk

$ gbak -c -user SYSDBA -password masterkey out.fbk restored.fdb
gbak:cannot commit index FKA6F4437CE96F23CD
gbak: ERROR:violation of FOREIGN KEY constraint "FKA6F4437CE96F23CD" on table "FILE_SEQ"
gbak: ERROR:    Foreign key reference target does not exist
gbak:cannot commit index FKA6F4437C58FFBFFC
gbak: ERROR:violation of FOREIGN KEY constraint "FKA6F4437C58FFBFFC" on table "FILE_SEQ"
gbak: ERROR:    Foreign key reference target does not exist
gbak:Database is not online due to failure to activate one or more indices.
gbak:Run gfix -online to bring database online without active indices.

If anyone is interested I can provide you with more details or even complete database files for further investigation. We have loads of corruptions like these three.

Best Regards
    //Jan Flyborg


2014-09-13 22:19 GMT+02:00 Alexey Kovyazin ak@... [firebird-support] <firebird-support@yahoogroups.com>:
 

Hi Jan,

You did not tell what kind of corruption you had (please provide full text of error). There are plenty of them, as well as reasons.
You also could use our tool FirstAID (Direct) to analyze database on low level and see where are the problems.

Regards,
Alexey Kovyazin
IBSurgeon (www.ib-aid.com)



 
Hi,

We have shipped Firebird Embedded bundled together with our product for a few years now and the system is currently in production at several thousand of our customer's sites. Currently we are using Firebird Embedded 2.5.1 with the latest .NET-driver and a stack consisting of Castle Active Record on top on NHibernate and the system is running on the latest versions of Windows.

All is well and Firebird has served us good so far with the exception of database corruptions that gets reported from a new set of customers every week. For some of them it is possible to instruct the customer on how to repair the databases themselves, but some of the databases are unfortunately so heavily corrupted that they need to be sent to us for repairing (which is a tedious work that steals time from other tasks). Most of them corruptions are normally found in the tables that gets the most writes, but I guess that is only natural.

We are now at the planning stage for the next major release of our product and we are thus rethinking if Firebird really is a good choice, because of this.

Lots of effort has gone into solving this problem on our side, so I think the normal prerequisites has already been put into place (e.g using forced writes and so forth), but our system needs to be up and running 24x7, which means that it is not possible to schedule periodic backup/restore cycles and my personal theory is that Firebird embedded gets corrupted over time if you are not doing this regularly.

So I have have a few questions that I would appreciate if someone could answer:

1. Is it feasible to run Firebird Embedded 24x7 in a setup where there are no scheduled backup/restore cycles. If not, how often should this be performed to ensure that the database does not get corrupted.

2. Most of our customers are not using a UPS. From my experiments I have not managed to create a corrupted database by turning of the power while doing a large set of writes (in a session running in VirtualBox). Could someone please confirm that this is indeed safe when you are running with synchronized writes turned on?

3. Are there any operations on a live database that should be avoided to minimize the risk of corruptions?

4. Just read a discussion about whether it is needed or not to call fb_shutdown to stop Firebird Embedded. Could this be the reason why we are getting corruptions? Should we change our service to perform this call when it is stopped?

5. I have also seen discussions of turning of automatic sweeps of the database (and doing them manually instead). Is this a likely source of corruptions for our setup?

Thanks in advance. Maybe are there no certain answers to my questions, but any pointers in the right direction would be very appreciated. Firebird has been a real workhorse for us and we would rather like to keep it.

Best Regards
    //Jan Flyborg