firebird-support - Re: [firebird-support] Re: Database page error after migrating to 1.5

Subject	Re: [firebird-support] Re: Database page error after migrating to 1.5
Author	Ann W. Harrison
Post date	2004-10-13T15:51:50Z

Heiko,

Let me see if I understand what happened. You're seeing corruption
on one of 50 databases.

Are all the databases backed up every night?

Do all the databases have identical schemas?

Have all the databases been migrated from 1.0 to 1.5?

Did you back up and restore the databases as part of the migration?

Are you still using Dialect 1 on all databases?

How many times has this happened? Is there any pattern to
its happening?

>And I am still thinking about InterClient as a possible problem maker.
>Why? The affected database is the ONLY database which is accessed
>via "standard" client interface AND InterClient.

Neither InterClient - nor any other client - can corrupt a database.
Only the engine (i.e. the non-network part of the server) can do that,
because only the engine writes to the database file. If you give the
server bad data, the server passes it to the engine, which processes
it before writing it. Telling Firebird to store a 12-byte string
in an 8-byte field won't corrupt the database.

>A simple gfix -m ..., gbak -b -g ..., gbak -c cures the database.

The problem with a "simple" gfix -m is that you may well be
losing data.

Looking at your errors....

>diego Tue Oct 12 20:21:52 2004
> Database: /database/intern/transdata.fdb
> internal gds software consistency check (decompression
>overran buffer (179))
>
>diego Wed Oct 13 07:52:08 2004
> Database: /database/intern/transdata.fdb
> Data page 95007 (sequence 253) is confused in table
>TBLDESKVERSAND (432)
>
>diego Wed Oct 13 07:52:08 2004
> Database: /database/intern/transdata.fdb
> Record 213428 is wrong length in table TBLCFGVALUES (405)

These errors together suggest that page 95007 might be doubly
allocated - listed in the pointer pages of two different
tables. Reading records from one table as if they were records
of another is the most common source of both wrong-length-record
errors and decompression errors.
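
To make "doubly allocated" concrete, here is a minimal sketch in
Python of the check involved. It assumes you already have, from
some analysis tool, a list of the data page numbers each table's
pointer pages claim. The function name and all page numbers other
than 95007 are made up for illustration; this is not how gfix or
the engine does it.

```python
from collections import defaultdict


def find_doubly_allocated(pointer_pages):
    """Return {page: [tables]} for every page claimed by more than one table.

    pointer_pages maps a table name to the data page numbers listed
    in that table's pointer pages (a hypothetical input format).
    """
    owners = defaultdict(set)
    for table, pages in pointer_pages.items():
        for page in pages:
            owners[page].add(table)
    return {page: sorted(tables)
            for page, tables in owners.items()
            if len(tables) > 1}


# Hypothetical page lists. Only page 95007 and the two table names
# come from the log above; everything else is invented.
example = {
    "TBLDESKVERSAND": [95005, 95006, 95007],
    "TBLCFGVALUES": [88010, 95007],
}
print(find_doubly_allocated(example))
```

A page that shows up under two tables is the one to look at first.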

RDB$FORMATS is the internal system table that contains the
definitive description of table formats. When you change a
table - add a field, increase the size of a field - the engine
creates a new format version. Old records are tagged with the
old format version, new records are tagged with the new format
version. When the engine reads a record, it knows from the
format version how long the record should be when it is
decompressed.

Using a format for table A on a record for table B is not
going to work unless the tables are identical.
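
A rough sketch of that lookup, in Python. The dictionary stands in
for RDB$FORMATS, and the lengths and format numbers are invented;
the real table stores a full field-by-field description, not just
a length.

```python
# (table, format version) -> expected decompressed record length.
# All values here are hypothetical.
FORMATS = {
    ("TBLDESKVERSAND", 1): 64,
    ("TBLDESKVERSAND", 2): 72,  # e.g. after a field was added
    ("TBLCFGVALUES", 1): 40,
}


def check_record(table, format_version, decompressed):
    """Raise ValueError if the record does not fit the table's format."""
    expected = FORMATS[(table, format_version)]
    if len(decompressed) != expected:
        raise ValueError(
            f"record is wrong length in table {table}: "
            f"got {len(decompressed)}, expected {expected}"
        )
    return decompressed


# An old record of TBLDESKVERSAND still checks out against format 1.
old_record = bytes(64)
check_record("TBLDESKVERSAND", 1, old_record)

# The same bytes read as if they belonged to TBLCFGVALUES do not.
try:
    check_record("TBLCFGVALUES", 1, old_record)
except ValueError as err:
    print(err)
```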

When the engine stores a record, it lays out all the fields in
order by field id, then performs run-length compression on the
null flag bits and data. The record header and data are stuffed
into the data portion of the page. The index portion of the
page holds the offset and length of each record on the page.
When the engine reads a record, it reads the number of bytes
specified in the page index entry for the record. The record
header includes the format version number, which tells the engine
how long this particular record should be when decompressed.
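
Here is a small Python sketch of why a mismatched format shows up
as a decompression error. It assumes a simplified run-length
scheme with a signed count byte (positive: that many literal bytes
follow; negative: repeat the next byte that many times). Treat the
byte layout and function names as illustrative, not as the
engine's exact on-disk encoding.

```python
def compress(data: bytes) -> bytes:
    """Run-length compress data using signed count bytes."""
    out = bytearray()
    literals = bytearray()

    def flush_literals():
        # Emit pending literal bytes in chunks of at most 127.
        for start in range(0, len(literals), 127):
            chunk = literals[start:start + 127]
            out.append(len(chunk))
            out.extend(chunk)
        literals.clear()

    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 128:
            run += 1
        if run >= 3:
            flush_literals()
            out.append(256 - run)  # two's-complement form of -run
            out.append(data[i])
        else:
            literals.extend(data[i:i + run])
        i += run
    flush_literals()
    return bytes(out)


def decompress(packed: bytes, expected_length: int) -> bytes:
    """Expand packed into exactly expected_length bytes or raise."""
    out = bytearray()
    i = 0
    while i < len(packed):
        control = packed[i] - 256 if packed[i] > 127 else packed[i]
        i += 1
        if control > 0:
            chunk = packed[i:i + control]
            i += control
        elif control < 0:
            chunk = packed[i:i + 1] * -control
            i += 1
        else:
            chunk = b""
        if len(out) + len(chunk) > expected_length:
            raise ValueError("decompression overran buffer")
        out.extend(chunk)
    if len(out) != expected_length:
        raise ValueError("record is wrong length")
    return bytes(out)


record = b"\x00" * 4 + b"ABC" + b" " * 20   # 27 bytes, made-up contents
packed = compress(record)
assert decompress(packed, len(record)) == record

# Decompress the same bytes with a length taken from the wrong format.
try:
    decompress(packed, 16)
except ValueError as err:
    print(err)
```

The length the decompressor expects comes from the format, not from
the compressed bytes, so a record read under the wrong table's
format either overruns the buffer or comes out the wrong length.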

None of this helps your broken database, of course, but it
might help you look for problems.

If you have a copy of the corrupt database, you might try the
analysis tools that IBSurgeon provides. They give much better
diagnostics than gfix.

Regards,

Ann