Subject RE: [firebird-support] Database corruption
Author Alan McDonald
> Hi all,
>
> We have been getting database corruption quite often, lately about once
> a week. We are using Firebird 1.0.3, version WI-V6.2.972. Once we get a
> corrupted database, we fix it with gfix. We usually lose data from one
> table (always the same table), and after the fix everything seems
> to be OK until the database gets corrupted again, a few days after it
> was fixed.
>
> Last time, I checked the server log and I have some questions; perhaps
> some of you could help me. These are the entries in the log before the
> problem occurred:
>
> ------------------------------------------
> NETPLCGRADER (Client) Mon Jan 19 06:44:19 2004
> Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe
>
>
> NETPLCGRADER (Client) Mon Jan 19 06:47:17 2004
> Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe
>
>
> NETPLCGRADER (Client) Tue Jan 20 06:44:07 2004
> Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe
>
>
> NETPLCGRADER (Client) Tue Jan 20 06:47:10 2004
> Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe
>
>
> NETPLCGRADER (Client) Wed Jan 21 06:44:10 2004
> Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe
>
>
> NETPLCGRADER (Server) Wed Jan 21 06:44:34 2004
> Database: C:\NETPLC\DATOS\GRADER.GDB
> internal gds software consistency check (cannot find record back version
> (291))
>
> ------------------------------------------
>
> Are the two 'Guardian starting ...' entries normal?
>
> Does the message 'cannot find record back version (291)' have any special
> meaning? Could it be related to gbak? We back up the database
> every day using gbak.
>
> Some days before there is also an error entry in the log: INET/inet_error:
> read errno = 10054. Could this have something to do with the corruption?
>
> After the database was fixed there were some new entries in the log, like
> for example:
>
> 'Relation has 403 orphan backversions (48 in use) in table MAINTCOUNT (128)'
> 'Chain for record 40 is broken in table ALARMS (135)'
> 'Data page 1831 (sequence 2) is confused in table ALARMS (135)'
> 'Page 1831 is an orphan'
>
> One more question: is there any particular value in the database
> statistics that would tell me that something is not OK?
>
> Thanks for your help,
>
> Trini.

10054 is usually a network error (Winsock "connection reset by peer"),
which suggests that a client PC's NIC or the server's NIC is faulty, or
that some other network hardware is causing outages.
The C:\ path suggests a Win32 platform. Do you have Forced Writes set to
ON? Please ensure you do. This will often avoid the corruption problems
caused by the hardware, but a good bout of hardware testing is in order, I
would say.
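
To check the Forced Writes setting, something like the following sketch may help. It assumes the header statistics printed by `gstat -h` contain an "Attributes" line that lists "force write" when the setting is ON, and that `gfix -write sync` is the switch that turns it on. The helper names and the sample header text are hypothetical, made up for illustration, and credentials (`-user`, `-password`) are left out.

```python
def forced_writes_enabled(header_text):
    """Return True if the 'Attributes' line of gstat -h output lists 'force write'."""
    for line in header_text.splitlines():
        stripped = line.strip().lower()
        if stripped.startswith("attributes"):
            return "force write" in stripped
    return False  # no Attributes line found: treat as not enabled


def gfix_forced_writes_command(db_path):
    """Build (but do not run) the gfix command that sets Forced Writes to ON."""
    return ["gfix", "-write", "sync", db_path]


# Illustrative header text only, not taken from the poster's database.
SAMPLE_HEADER = (
    "Database header page information:\n"
    "\tPage size\t\t4096\n"
    "\tAttributes\t\tforce write\n"
)

if __name__ == "__main__":
    print(forced_writes_enabled(SAMPLE_HEADER))
    print(" ".join(gfix_forced_writes_command(r"C:\NETPLC\DATOS\GRADER.GDB")))
```

The command list could be passed to `subprocess.run` on the server once nobody else is connected to the database.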
Alan
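
To see how often the server is being restarted and whether network errors come before the corruption, the log can be summarised with a small script. This is a minimal sketch that assumes the log has the layout quoted above (a "HOST (Client|Server) timestamp" header line followed by message lines); the function names and the three categories are assumptions chosen for illustration, not anything Firebird defines.

```python
import re
from collections import Counter

# Header lines look like: NETPLCGRADER (Server) Wed Jan 21 06:44:34 2004
HEADER = re.compile(
    r"^\s*(\S+)\s+\((Client|Server)\)\s+"
    r"(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$"
)


def parse_entries(text):
    """Split log text into entries: host, source, timestamp string, joined message."""
    entries = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {"host": match.group(1), "source": match.group(2),
                       "when": match.group(3), "lines": []}
            entries.append(current)
        elif current is not None and line.strip():
            current["lines"].append(line.strip())
    for entry in entries:
        entry["message"] = " ".join(entry.pop("lines"))
    return entries


def classify(message):
    """Put a log message into one of a few rough buckets."""
    if "consistency check" in message:
        return "consistency_check"
    if "Guardian starting" in message:
        return "guardian_start"
    if "inet_error" in message:
        return "network_error"
    return "other"


def summarize(text):
    """Count log entries per bucket."""
    return Counter(classify(entry["message"]) for entry in parse_entries(text))


# Two entries copied from the log quoted in this thread.
SAMPLE_LOG = r"""
NETPLCGRADER (Client) Wed Jan 21 06:44:10 2004
Guardian starting: C:\Archivos de programa\Firebird\bin\ibserver.exe


NETPLCGRADER (Server) Wed Jan 21 06:44:34 2004
Database: C:\NETPLC\DATOS\GRADER.GDB
internal gds software consistency check (cannot find record back version
(291))
"""

if __name__ == "__main__":
    for kind, count in summarize(SAMPLE_LOG).items():
        print(kind, count)
```

Running it over the whole log file, rather than the sample, would show whether the Guardian restarts and the 10054 errors cluster around the days the corruption appeared.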