Subject Re: Corrupt header problem
Author polydwarf820
> Your analysis is pretty good, though I would like to make some
> suggestions about reporting problems. First, specify the version
> of Firebird or InterBase you're using. Different versions have
> different problems. Second, give exact error messages - even if
> that means pressuring your clients to say something more than
> "my database is corrupt". Third, encourage your clients to backup
> their databases regularly.

1. We're using the latest open-source InterBase (6.0.1.6 Open
Edition). Moving to Firebird is definitely an option in the future
(possibly with the new IBX rewrite of our software), but for right
now we're trying to fix the bugs we have: the devil we know versus
the devil we don't, and all that.
2. The exact error message when trying to open this database with a
client (IBConsole is what gave us the following message) is: "IO
Error for <filename>. Error while trying to read from file. Reached
end of file."
3. The customer had a "backup" system in place, however it consisted
of a nightly backup... to the same media (We didn't know this until
we tried to restore from backup, and then they told us what their
backup procedure was. Of course, the file was corrupted when they
did their last nightly backup). We're in the process of informing
their IT controller about good backup procedures.

> That said, your analysis is enough to tell me what happened - with
> a 90% probability. You're running on a windows operating system and
> you're using different connect strings for different clients. That
> causes InterBase V4, V5, & V6 to attach the database once for each
> connect string, breaking the lowest level of concurrency control -
> the part that maintains the on-disk consistency.

After discussing more with our support people, I've found that we are
definitely doing this.
All of the "remote"/"client" workstations (i.e. machines that the
server is not on) are using the same connect string (as set by our
installer, which does build a proper connect string, i.e.
SERVERNAME:c:\data\database.gdb).
However, on the machine that the server does reside on, the connect
string is just c:\data\database.gdb: no SERVERNAME, etc. This was
remedied this morning in our install program, so that the server
machine's connect string for our application is exactly the same as
the clients' connect string.
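
To audit existing installs for the same mismatch, something like the
following could be run over the connect strings collected from each
workstation. It is only a minimal sketch: the function names are made
up for illustration, the comparison is deliberately strict (exact
string equality) because I don't know which differences the server
tolerates, and a one-letter server name would be mistaken for a drive
letter.

```python
import ntpath


def split_connect_string(conn):
    """Split 'SERVERNAME:c:\\path\\db.gdb' into (server, path).

    Returns (None, conn) when the string is a bare local path.
    Hypothetical helper, not part of any InterBase/Firebird API.
    """
    # A bare Windows path starts with a drive such as 'c:'.
    drive, _ = ntpath.splitdrive(conn)
    if drive:
        return None, conn
    server, sep, path = conn.partition(":")
    if not sep:
        return None, conn
    return server, path


def check_consistent(conn_strings):
    """Return (all_identical, local_only_strings).

    Strict on purpose: any difference at all, even in case, counts
    as an inconsistency.
    """
    all_identical = len(set(conn_strings)) == 1
    local_only = [c for c in conn_strings
                  if split_connect_string(c)[0] is None]
    return all_identical, local_only


if __name__ == "__main__":
    collected = [
        r"SERVERNAME:c:\data\database.gdb",  # remote workstations
        r"c:\data\database.gdb",             # the server machine itself
    ]
    ok, local_only = check_consistent(collected)
    print("consistent:", ok)
    print("strings with no server name:", local_only)
```

The same check would need to cover whatever string the administrator
types into IBConsole, since that counts as one more attachment.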

> The best solution is to convert to Firebird RC2, which will detect
> that problem and prevent the corruption. The next best solution is
> to move your databases from Windows to Linux (or Solaris, or HP, or
> any grown-up operating system). The third best solution is to be
> absolutely religious about connection strings - including the
> strings used by the system administrator with IBConsole.

Solution 1 is something we'll be discussing. However, our concern is
the point at which InterBase and Firebird diverge enough that
components written to access one won't work with the other
(specifically InterBase Express). If there are going to be IBX
components that do the same job, with at least the same performance,
then we're fine.

> The "end of file" error occurs because the system is looking for a
> transaction inventory page (80% probability) that did not get
> written. If you increment the next transaction to some arbitrary
> value with a hex editor, you may have made the system think that it
> has more tips than were actually created.

We were getting the end of file error before we started hex editing
the header of the gdb file.
When we started playing with transaction counters, we were
decrementing them, not incrementing them, on the assumption that if
we could get back to the last "good" transaction, we could then use
the gdb file from there, with the amount of data that could be "seen"
and whatever was lost was the client's fault, because of their poor
backup procedures.
My question is, if all of the restore procedures that have been
outlined previously don't work, is decrementing the transaction
counters like this, to get to the last "good" transaction in the
file, a valid approach to take to repair this file and get the
customer back up and running in the short term (probably followed by
a gbak backup and restore once the gdb file can be opened, so we have
a clean gdb file)? Will there be issues that are not immediately
obvious (aside from the loss of data that's almost sure to result)?
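
To make sure I follow the TIP explanation quoted above, here is the
arithmetic as I understand it, as a rough sketch only. It assumes each
transaction takes 2 bits of state on a transaction inventory page, and
that each such page loses some fixed number of bytes to its own
header; the 20-byte overhead is my guess rather than a figure from the
documentation, and the function name is made up.

```python
def tip_pages_needed(next_transaction, page_size, tip_overhead_bytes=20):
    """Estimate how many transaction inventory pages a counter implies.

    ASSUMPTIONS (not confirmed against the on-disk format):
      - 2 bits of state per transaction, i.e. 4 transactions per byte
      - tip_overhead_bytes of each page are header, not transaction state
    """
    per_page = (page_size - tip_overhead_bytes) * 4
    # Ceiling division: a partly used page still has to exist on disk.
    return -(-next_transaction // per_page)


if __name__ == "__main__":
    page_size = 1024
    for counter in (4000, 5000, 50000):
        pages = tip_pages_needed(counter, page_size)
        print(counter, "transactions ->", pages, "TIP page(s)")
```

If that is right, a header counter that implies more TIPs than were
actually written would send the engine looking for a page past the end
of the file, and decrementing the counter only helps if the value we
pick falls within the pages that do exist.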

- Jason