Subject RE: [firebird-support] Fixing corrupt data
Author Scott Buckham
>>We have a client that was hit by power surges within the last month. We
>>backed up and restored to a new hard drive after a few attempts without
>>any extra switches (to my knowledge). The client continued to run for a
>>few days until it seemed to be having problems reading our license files
>>from the database (includes blobs). Deleting and reinstalling the
>>licenses seem to give the application life again for a few days only
>>before the same problem occurred.
>>
>>The following exception seems to occur periodically or when I attempt to
>>do a backup on the running database:
>>
>>at org.firebirdsql.gds.GDSException: I/O error during "CreateFile (open)"
>>operation for file "server"
>>Error while trying to open file
>>null
>> at org.firebirdsql.jgds.GDS_Impl.readStatusVector(GDS_Impl.java:1698)
>> at org.firebirdsql.jgds.GDS_Impl.receiveResponse(GDS_Impl.java:1651)
>> at org.firebirdsql.jgds.GDS_Impl.isc_attach_database(GDS_Impl.java:290)
>> at org.firebirdsql.jgds.GDS_Impl.isc_attach_database(GDS_Impl.java:252)
>> at org.firebirdsql.jca.FBManagedConnectionFactory.createDbHandle(FBManagedConnectionFactory.java:543)
>> at org.firebirdsql.jca.FBManagedConnection.<init>(FBManagedConnection.java:109)
>> at org.firebirdsql.jca.FBManagedConnectionFactory.createManagedConnection(FBManagedConnectionFactory.java:374)
>> at org.firebirdsql.jca.FBStandAloneConnectionManager.allocateConnection(FBStandAloneConnectionManager.java:61)
>> at org.firebirdsql.jdbc.FBDataSource.getConnection(FBDataSource.java:104)
>> at org.firebirdsql.jdbc.FBDriver.connect(FBDriver.java:275)
>> at java.sql.DriverManager.getConnection(Unknown Source)
>>
>>Eventually this problem was occurring so frequently that trading could not
>>continue.

>That particular part of the problem you must ask about in the firebird-java
>list. There, you'll get a clearer picture of what the Jaybird
>implementation of the backup service is actually trying to do when this i/o
>error occurs. Absent the Java interface, the finger would point at a
>backup file of the supplied name already being present in the specified
>directory; or of the directory path to where gbak is supposed to create
>the file being invalid.

We are not using the Jaybird driver for this (only available in version
2.0). We are running the backup directly from gbak, called from a batch file.
The executable is always in the same location, and gbak allows a backup to
overwrite an existing file (so this was not an issue in this instance).

>>We attempted to install on other new hardware to eliminate this as
>>a cause but the problem remains.
>>
>>gfix -v -full gives me 1 record level error, which gfix -mend -full cannot
>>fix.

>You can probably find the page containing the offending record using the
>free corruption analysis tool from www.ibsurgeon.com.

I'm having a look into this - the free analysis tool looks quite manual. I
didn't see anything that would "detect" a corruption.

>>
>>Doing a gbak -b -v -ignore -t -limbo would not restore 1st go but would on
>>subsequent attempts - this surprised me. I have reproduced this on several
>>occasions.

>Unless you were actually using transactions across multiple databases at
>the time, limbo transactions would be irrelevant to the current corruption
>issue. Limbo transactions don't occur under any other conditions.

>>If I added the -i switch I did not seem to have any problems.

>That tells gbak to ignore checksum errors, which is fine, since they are
>meaningless anyway. AFAIK, a checksum will be either corrupt or one static
>value.

-i tells gbak to restore indexes as inactive (-ig is the switch that ignores
bad checksums). I tried this option because the restore was only complaining
when it was restoring indexes (as detailed below from the original message).
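
To make the switch difference concrete, here is a minimal sketch that only assembles the gbak restore command line; it does not run anything. The file names are hypothetical, and it assumes credentials are supplied separately (for example through the ISC_USER and ISC_PASSWORD environment variables).

```python
def gbak_restore_args(backup_file, database, inactive_indexes=False, verbose=True):
    """Build a gbak restore command line as a list of arguments."""
    args = ["gbak", "-c"]      # -c(reate): restore into a new database file
    if verbose:
        args.append("-v")      # -v(erbose): report each step of the restore
    if inactive_indexes:
        args.append("-i")      # -i(nactive): restore indexes without activating them
    args += [backup_file, database]
    return args


# Hypothetical file names, for illustration only.
print(" ".join(gbak_restore_args("server.fbk", "server_restored.fdb",
                                 inactive_indexes=True)))
```

With the indexes left inactive, the restore never has to build RDB$PRIMARY53, which would be consistent with the error disappearing; the indexes would still have to be activated afterwards.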

>>
>>The exception I get from the
>>.
>>gbak: cannot commit index RDB$PRIMARY53
>>gbak: ERROR: internal gds software consistency check (wrong record length
>>(183))
>>
>>gbak: ERROR: internal gds software consistency check (can't continue after
>>bugcheck)
>>gbak: Exiting before completion due to errors
>>gbak: ERROR: internal gds software consistency check (can't continue after
>>bugcheck)
>>gbak: ERROR: internal gds software consistency check (can't continue after
>>bugcheck)

>You didn't finish the sentence. You should get metadata commit exceptions
>from a backup process. Or are you now talking about an attempt to
>restore? And, if so, I hope you were using the -r switch for the restore.

This was from a restore. It may have been -c instead of -r (but to a new
file). Is this a problem?

>>
>>At this stage we are creating the database structure from scratch and then
>>pumping the data back into the Main Server from a few files that were run
>>over 3 days of issues. Is this a satisfactory plan of attack? Can I be
>>confident that the database structure is all that is corrupt and that I can
>>insert the data into it without corrupting the file again?

>OK, it's not at all clear what you're actually doing here. When you say
>"creating the database structure from scratch", presumably you mean you are
>running a DDL script. Then you say you are pumping data back from a few
>files...over 3 days of issues. What does this mean? Do 3 textfiles (or
>whatever) contain all of the data that you lost? At this point it's
>impossible to comment on this approach without some clarification.

I back up the server database's metadata only (gbak -b -m). Then I use IBPump
with the current (corrupt) server as the "source" and the restored metadata
backup as the "destination". This seems to work fine, but I am not sure if I
am just delaying a problem or whether I can be confident that it was just
the metadata that was corrupt.
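
As a rough sketch of the two gbak steps in that procedure (command lines only, nothing is executed): the file names are hypothetical, credentials are assumed to come from the ISC_USER and ISC_PASSWORD environment variables, and the IBPump data transfer that follows is done in that tool and is not shown.

```python
def metadata_rebuild_commands(corrupt_db, metadata_backup, new_db):
    """Return the two gbak command lines for a metadata-only rebuild."""
    return [
        # -b(ackup) with -m(etadata): back up the structure only, no table data
        ["gbak", "-b", "-m", corrupt_db, metadata_backup],
        # -c(reate): restore the metadata backup into a new, empty database
        ["gbak", "-c", metadata_backup, new_db],
    ]


# Hypothetical file names, for illustration only.
for cmd in metadata_rebuild_commands("server.fdb", "server_meta.fbk", "server_new.fdb"):
    print(" ".join(cmd))
```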

>Basically, if your corruption has defeated the gfix/gbak routines, you're
>fairly surely looking at physical corruption. Now, if your account is
>correct - that the original DB was actually backed up and then restored
>onto a new disk - then what you are looking at is some kind of recurring
>physical (or logical) trauma that has been happening *since* that restore.

I thought that it was interesting that the restore would not work the first
time but would on subsequent attempts. I have reproduced this on the
database a couple of times. Is it possible that I could restore a corrupt db
of any sort?

>>
>>
>>Firebird version: 1.5.2
>>Operating System: Windows XP Professional
>>Jaybird driver: 1.0.1

>You're using a driver that is too old (more than 2 years!!) to be aware of
>Firebird 1.5 architecture, never mind the raft of bugfixes to the driver
>between then and now (current releases are 1.5.6 and 2.0.0). I think you
>need to get yourself over to the firebird-java list and ask whether some of
>your problems are attributable to that old driver. It's well possible that
>someone over there will be able to put a finger right on the button.

We moved to firebird at about RC8. 1.0.1 was the only non-beta version
available at the time and we haven't had any issues that have seemed driver
related. I have never seen this documented in any of the release notes. Is
it documented somewhere else? I will fix this ASAP.

>./heLen




