firebird-support - Re: [firebird-support] Fixing corrupt data

Subject	Re: [firebird-support] Fixing corrupt data
Author	Helen Borrie
Post date	2005-12-01T04:01:44Z

At 01:19 PM 1/12/2005 +1000, Scott Buckham wrote:

>We have a client that was hit by power surges within the last month. We
>backed up and restored to a new hard drive after a few attempts without any
>extra switches (to my knowledge). The client continued to run for a few days
>until it seemed to be having problems reading our license files from the
>database (includes blobs). Deleting and reinstalling the licenses seem to
>give the application life again for a few days only before the same problem
>occurred.
>
>The following exception seems to occur periodically or when I attempt to do
>a backup on the running database:
>
>at org.firebirdsql.gds.GDSException: I/O error during "CreateFile (open)"
>operation for file "server"
>Error while trying to open file
>null
> at org.firebirdsql.jgds.GDS_Impl.readStatusVector(GDS_Impl.java:1698)
> at org.firebirdsql.jgds.GDS_Impl.receiveResponse(GDS_Impl.java:1651)
> at org.firebirdsql.jgds.GDS_Impl.isc_attach_database(GDS_Impl.java:290)
> at org.firebirdsql.jgds.GDS_Impl.isc_attach_database(GDS_Impl.java:252)
> at org.firebirdsql.jca.FBManagedConnectionFactory.createDbHandle(FBManagedConnectionFactory.java:543)
> at org.firebirdsql.jca.FBManagedConnection.<init>(FBManagedConnection.java:109)
> at org.firebirdsql.jca.FBManagedConnectionFactory.createManagedConnection(FBManagedConnectionFactory.java:374)
> at org.firebirdsql.jca.FBStandAloneConnectionManager.allocateConnection(FBStandAloneConnectionManager.java:61)
> at org.firebirdsql.jdbc.FBDataSource.getConnection(FBDataSource.java:104)
> at org.firebirdsql.jdbc.FBDriver.connect(FBDriver.java:275)
> at java.sql.DriverManager.getConnection(Unknown Source)
>
>Eventually this problem was occurring so frequently that trading could not
>continue.

That particular part of the problem you must ask about in the firebird-java
list. There, you'll get a clearer picture of what the Jaybird
implementation of the backup service is actually trying to do when this I/O
error occurs. Absent the Java interface, the finger would point either at a
backup file of the supplied name already being present in the specified
directory, or at an invalid directory path for the location where gbak is
supposed to create the file.
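
The two causes named above (a file of the same name already present, or an
invalid target directory) can be ruled out before the backup is started. The
following is only a minimal sketch of such a pre-check, not part of gbak or
Jaybird; the class and method names are hypothetical, and it assumes the check
runs on the machine where the backup file is written. The account that runs
the check may have different permissions from the one the Firebird service
runs under.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class BackupTargetCheck {

    /** Returns the problems found with the intended backup file path (empty if none). */
    public static List<String> findProblems(Path backupFile) {
        List<String> problems = new ArrayList<>();
        Path dir = backupFile.toAbsolutePath().getParent();
        if (dir == null || !Files.isDirectory(dir)) {
            // Cause 2: the directory the backup should be created in is invalid.
            problems.add("target directory does not exist: " + dir);
        } else if (!Files.isWritable(dir)) {
            problems.add("target directory is not writable: " + dir);
        }
        if (Files.exists(backupFile)) {
            // Cause 1: a file of the supplied name is already present.
            problems.add("a file of that name already exists: " + backupFile);
        }
        return problems;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java BackupTargetCheck <backup-file-path>");
            return;
        }
        List<String> problems = findProblems(Paths.get(args[0]));
        if (problems.isEmpty()) {
            System.out.println("no obvious problem with " + args[0]);
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```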

>We attempted to install another new hardware to eliminate this as
>a cause but the problem remains.
>
>gfix -v -full gives me 1 record level error, which gfix -mend -full cannot
>fix.

You can probably find the page containing the offending record using the
free corruption analysis tool from www.ibsurgeon.com.

>
>Doing a gbak -b -v -ignore -t -limbo would not restore 1st go but would on
>subsequent attempts - this surprised me. I have reproduced this on several
>occasions.

Unless you were actually using transactions across multiple databases at
the time, limbo transactions would be irrelevant to the current corruption
issue. Limbo transactions don't occur under any other conditions.

>If I added the -i switch I did not seem to have any problems.

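If you mean -ig[nore], that tells gbak to ignore checksum errors, which is
fine, since they are meaningless anyway. AFAIK, a checksum will be either
corrupt or one static value. Note, though, that a bare -i is gbak's
abbreviation for -inactive, which restores the indexes in an inactive state;
that would sidestep an index that cannot be committed during a restore.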

>
>The exception I get from the
>.
>gbak: cannot commit index RDB$PRIMARY53
>gbak: ERROR: internal gds software consistency check (wrong record length
>(183))
>
>gbak: ERROR: internal gds software consistency check (can't continue after
>bugcheck)
>gbak: Exiting before completion due to errors
>gbak: ERROR: internal gds software consistency check (can't continue after
>bugcheck)
>gbak: ERROR: internal gds software consistency check (can't continue after
>bugcheck)

You didn't finish the sentence. You shouldn't get metadata commit exceptions
from a backup process. Or are you now talking about an attempt to
restore? And, if so, I hope you were using the -r switch for the restore.
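
To keep the backup and restore steps apart, the commands mentioned in this
thread can be written out as explicit argument lists. This is a hedged sketch
only: the class name and file paths are hypothetical, credentials (-user,
-password) are left out, and nothing is executed. The switches are the ones
quoted above, with the meanings they have in gbak and gfix for Firebird 1.5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoveryCommands {

    /** gfix -v -full: validate the database, including record structures. */
    public static List<String> validate(String db) {
        return Arrays.asList("gfix", "-v", "-full", db);
    }

    /** gfix -mend -full: mark corrupt records as unavailable so a backup can skip them. */
    public static List<String> mend(String db) {
        return Arrays.asList("gfix", "-mend", "-full", db);
    }

    /** Backup as quoted in the thread: verbose, ignore checksums, transportable, ignore limbo. */
    public static List<String> backup(String db, String backupFile) {
        return Arrays.asList("gbak", "-b", "-v", "-ignore", "-t", "-limbo", db, backupFile);
    }

    /**
     * Restore with -r (replace_database in Firebird 1.5; it overwrites an
     * existing target, so a new file name is assumed here). The optional -i
     * is gbak's abbreviation for -inactive: indexes are restored inactive.
     */
    public static List<String> restore(String backupFile, String newDb, boolean inactiveIndexes) {
        List<String> cmd = new ArrayList<>(Arrays.asList("gbak", "-r"));
        if (inactiveIndexes) {
            cmd.add("-i");
        }
        cmd.add(backupFile);
        cmd.add(newDb);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical paths; a file-level copy of the damaged database is assumed.
        String db = "copy_of_main.fdb";
        String bak = "main.fbk";
        System.out.println(String.join(" ", validate(db)));
        System.out.println(String.join(" ", mend(db)));
        System.out.println(String.join(" ", backup(db, bak)));
        System.out.println(String.join(" ", restore(bak, "main_restored.fdb", true)));
    }
}
```

Printing the lists rather than running them makes it easy to see whether a
reported error belongs to the backup command or to the restore command.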

>
>At this stage we are creating the database structure from scratch and then
>pumping the data back into the Main Server from a few files that were run
>over 3 days of issues. Is this a satisfactory plan of attack? Can I be
>confident that the database structure is all that is corrupt and that I can
>insert the data into it without corrupting the file again?

OK, it's not at all clear what you're actually doing here. When you say
"creating the database structure from scratch", presumably you mean you are
running a DDL script. Then you say you are pumping data back from a few
files...over 3 days of issues. What does this mean? Do 3 textfiles (or
whatever) contain all of the data that you lost? At this point it's
impossible to comment on this approach without some clarification.

Basically, if your corruption has defeated the gfix/gbak routines, you're
fairly surely looking at physical corruption. Now, if your account is
correct - that the original DB was actually backed up and then restored
onto a new disk - then what you are looking at is some kind of recurring
physical (or logical) trauma that has been happening *since* that restore.

>
>
>Firebird version: 1.5.2
>Operating System: Windows XP Professional
>Jaybird driver: 1.0.1

You're using a driver that is too old (more than 2 years!!) to be aware of
Firebird 1.5 architecture, never mind the raft of bugfixes to the driver
between then and now (current releases are 1.5.6 and 2.0.0). I think you
need to get yourself over to the firebird-java list and ask whether some of
your problems are attributable to that old driver. It's quite possible that
someone over there will be able to put a finger right on the button.
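
Before asking on the firebird-java list, it can help to confirm which driver
the application actually loads. The sketch below uses only the standard JDBC
API; the class name is hypothetical, and the driver class name
org.firebirdsql.jdbc.FBDriver is taken from the stack trace above. JDBC
reports only a major and minor number, so the point release still has to be
read from the jar itself.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListJdbcDrivers {

    /** Returns "class major.minor" for every driver registered with DriverManager. */
    public static List<String> describeDrivers() {
        List<String> out = new ArrayList<>();
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            out.add(d.getClass().getName() + " "
                    + d.getMajorVersion() + "." + d.getMinorVersion());
        }
        return out;
    }

    public static void main(String[] args) {
        try {
            // Older drivers are not discovered automatically; load the class explicitly.
            Class.forName("org.firebirdsql.jdbc.FBDriver");
        } catch (ClassNotFoundException e) {
            System.out.println("Jaybird is not on the classpath");
        }
        describeDrivers().forEach(System.out::println);
    }
}
```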

./heLen