Subject | Re: [firebird-support] Database corruption |
---|---|
Author | Ann W. Harrison |
Post date | 2005-06-17T17:18:59Z |
innoy1k wrote:
> 1) gbak was run from IBConsole, identified the corrupted table_A. It
> was believed that the corruption was on the Blob field.

Was there some specific hint that made you think the problem was
with the blob?

> 2) A program was written to repair the database by removing the
> corrupted records.

How did you do that? As a general thing, you can't delete a corrupt
record because the delete needs to read enough of the data to release
the record, its blobs, and any back versions.

> 3) ran gfix and gbak to the repaired database on a couple platforms:
> Win2000 and WinServer2003SE. Also ran gbak from IBConsole, all runs
> were successful.

OK. At that point, everything was working.

> 4) sent the repaired database back to client's server
> (WinServer2003SE), ran gbak from IBConsole, and got errors straight
> away on table_A. The error is: Database file appears corrupt(); bad
> checksum; checksum error on database page 54805; gds_$get_segment failed.

OK. How did you send the database back? How was it put on the client's
server? When you were running gbak, was it trying to backup or restore
the database?

The checksum is validated immediately after a page is read from disk.
Checksums were abandoned in InterBase 4 and replaced with a simple
signature, 12345, that marks the page as plausible. The page read
returned something implausible for page 54805... (what size pages are
you using?) gds_$get_segment is a blob call - blobs were originally
called segmented strings (by a pompous bunch of up-tight marketing
people who considered the word "blob" vulgar). gds_$get_segment reads
the next piece of a blob. If it failed because of a checksum error,
then the blob is large (multi-page) and page 54805 should be a blob page.
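
The plausibility check described above can be sketched roughly as follows. This is only an illustration, not Firebird's own code. It assumes the page header layout I believe Firebird 1.5 (ODS 10) uses on a little-endian machine: a one-byte page type at offset 0, a one-byte flags field, then a 16-bit field at offset 2 holding the 12345 signature. The blob page type code of 8, the default page size of 4096, and the name `check_page` are likewise assumptions made for the example.

```python
import struct

PAGE_SIGNATURE = 12345   # fixed value kept in the old checksum field (see text above)
BLOB_PAGE_TYPE = 8       # assumption: page type code for blob pages in ODS 10
CHECKSUM_OFFSET = 2      # assumption: type byte, flags byte, then the 16-bit checksum


def check_page(db_file, page_number, page_size=4096):
    """Read one page from a binary file object and report whether it looks plausible."""
    db_file.seek(page_number * page_size)
    page = db_file.read(page_size)
    if len(page) < page_size:
        raise ValueError("short read: page %d is past the end of the file" % page_number)
    page_type = page[0]
    # "<H" = little-endian unsigned 16-bit, an assumption that fits x86 Windows servers
    (checksum,) = struct.unpack_from("<H", page, CHECKSUM_OFFSET)
    return {
        "page_type": page_type,
        "is_blob_page": page_type == BLOB_PAGE_TYPE,
        "signature_ok": checksum == PAGE_SIGNATURE,
    }
```

Run against a copy of the file (never one the server has open), something like `check_page(open("copy.fdb", "rb"), 54805, page_size=...)` with the database's real page size would show whether page 54805 carries the signature and whether it is typed as a blob page.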

> 5) reinstalled FB1.5 and IBConsole, ran gbak on the repaired database,
> but failed again.

OK.

> 6) installed FB1.5 and IBConsole on another platform (WinXP) of this
> client's office, ran gbak on the repaired database, the run was
> successful.

Ah ha. Where was the database? Still on the original server?

> 7) scanned the hard drive and its mirror disk on the server, it seen
> ok, changed a couple controllers, and disable mirror disk, then ran
> the repaired database, but failed again.
>
> In short, the checksum corruption is happening instantly. My questions
> are:
>
> Do you think this is a hardware problem?

I always think unexpected problems are hardware problems. I try not to
say so, because I'm almost always wrong.

> Why no other corruption in the same machine?

Dunno.

> What is gds_$get_segment?

As above.

Here's what I would do. One variable is the use of the services API
through IBConsole. As you know, IBConsole is a Borland product and the
fact that it works at all with Firebird annoys them a lot - so try
running the Firebird gbak utility, both on the server and on a separate
client system. There's a small but non-negligible chance that you've
got version skew in the message files and the messages you're seeing
aren't what the server intends to send. Running the gbak from a clean
1.5 installation should avoid that problem.
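
One way to run the same gbak backup from both machines and compare the results is a small wrapper like the one below. It is a sketch under assumptions: gbak is taken to be the Firebird 1.5 binary and reachable as `gbak` (pass `gbak_path` otherwise), and the helper names are invented for the example. The switches used (`-b`, `-v`, `-user`, `-password`) are the standard gbak backup options.

```python
import subprocess


def gbak_backup_command(database, backup_file, user, password, gbak_path="gbak"):
    """Build a gbak command line that backs up `database` into `backup_file`."""
    return [gbak_path, "-b", "-v",
            "-user", user, "-password", password,
            database, backup_file]


def run_gbak(command):
    """Run gbak and return (exit code, combined stdout/stderr text)."""
    result = subprocess.run(command,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.returncode, result.stdout
```

On the server the `database` argument would be a local path; from the client it would be a remote connection string of the form `servername:path`. If the two runs report different messages for the same database, that points at the tools or message files rather than the file itself.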
If gbak consistently reports an error on databases on the particular
server system and you can copy the same file to other servers and it
runs OK, then maybe, just maybe, you need an exorcist for the server.
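
Before blaming the server, it may be worth confirming that the copy on each machine really is the same file, byte for byte. A minimal sketch, assuming the database is shut down (or at least not attached) while it is read, and using a hypothetical helper name `file_digest`:

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest computed on the problem server differs from the one computed where gbak succeeds, the damage happened in transit or on that server's storage, not in the repaired database.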
Regards,
Ann