Subject Re: [ib-support] strange SQL error - internal gds software consistency check - found the log file results
Author Ann W. Harrison
At 12:29 AM 11/6/2001 -0600, Mark Meyer wrote:


>the database just crashed again a few minutes ago approx 11:50PM 11/05/2001.
>
>you mentioned above that the .log file may help in diagnosing the problem.
>i see some read errors that occurred just before the bug check error.

OK. The send & read errors probably represent three clients that exited
without closing the connection to InterBase. That should not cause a
crash - the engine is well used to clients that go away in the middle of
a request.


>sunburn (Server) Mon Nov 5 19:01:11 2001
> Super Server/main: Bad client socket, send() resulted in SIGPIPE,
>caught by server
> client exited improperly or crashed ????
>
>sunburn (Server) Mon Nov 5 19:01:11 2001
> INET/inet_error: send errno = 32

Those two are the first client that exited ungracefully.



>sunburn (Server) Mon Nov 5 19:44:44 2001
> INET/inet_error: read errno = 131
>
>sunburn (Server) Mon Nov 5 19:47:15 2001
> INET/inet_error: read errno = 131
>
>sunburn (Server) Mon Nov 5 19:50:13 2001
> INET/inet_error: read errno = 131
>
>sunburn (Server) Mon Nov 5 19:51:44 2001
> INET/inet_error: read errno = 131

My guess, and it's only a guess, is that those four messages
come from the keep-alive which hasn't yet given up on a connection
that isn't really alive.

>sunburn (Server) Mon Nov 5 20:22:47 2001
> INET/inet_error: read errno = 9

And this is the third.
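
To look up what the errno values in those messages mean, a small sketch like the one below may help. It is an illustration in Python, not part of InterBase, and the helper name describe is made up for this example. errno numbering is platform-specific: 32 is EPIPE and 9 is EBADF on the common Unix systems, but 131 differs from one system to another, so the result is only meaningful when the script runs on the server's own operating system.

```python
import errno
import os

# errno values copied from the log excerpts above.
LOGGED = [32, 131, 9]


def describe(code):
    """Return (symbolic name, message) for an errno value on THIS platform.

    describe is a hypothetical helper for this sketch. The mapping comes
    from the local C library, so run it on the machine that wrote the log.
    """
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


if __name__ == "__main__":
    for code in LOGGED:
        name, message = describe(code)
        print(f"errno {code}: {name} ({message})")
```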


>sunburn (Server) Mon Nov 5 23:44:45 2001
> Database: /data1/tclick/db/graycon1105.db
> internal gds software consistency check (cannot find record back
>version (291))
>
>sunburn (Server) Mon Nov 5 23:44:45 2001
> Database: /data1/tclick/db/graycon1105.db
> internal gds software consistency check (cannot find record back
>version (291))

These two errors are much more serious and almost certainly unrelated
to the earlier messages. Guessing again, I suspect that you are not using
forced writes and the server crashed hard sometime in the past, leaving
a bad pointer. Here are ways to recover:

1) Back up the database with gbak using the -g (no garbage collect)
switch. If that works, restore that backup and everything will
be fine.

2) Use gfix -v -f -n to examine the database - if it reports the same
errors, repeat the command, omitting the -n switch so gfix will
eliminate the bad pointer(s).

If neither of those works, make a copy of the database and try gfix -m
on the copy. That introduces the chance of losing data - perhaps a lot
of data.
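
The steps above can be sketched as command lines. The Python below only builds and prints them; it runs nothing, because each later step is worth trying only if the earlier one fails. The database path is the one from the log; the backup, restored, and copy file names are hypothetical placeholders, and -b and -c are the usual gbak backup and restore switches. Credentials are left out on the assumption that ISC_USER and ISC_PASSWORD are set in the environment.

```python
# Database path taken from the log above. The other three file names are
# hypothetical placeholders chosen for this sketch.
DB = "/data1/tclick/db/graycon1105.db"
BACKUP = "/data1/tclick/db/graycon1105.gbk"
RESTORED = "/data1/tclick/db/graycon1105_new.db"
COPY = "/data1/tclick/db/graycon1105_copy.db"


def recovery_steps(db=DB, backup=BACKUP, restored=RESTORED, copy=COPY):
    """Return (label, argv) pairs in the order suggested above."""
    return [
        ("1a. back up, no garbage collection", ["gbak", "-b", "-g", db, backup]),
        ("1b. restore that backup to a new file", ["gbak", "-c", backup, restored]),
        ("2a. validate, report only", ["gfix", "-v", "-f", "-n", db]),
        ("2b. validate again without -n", ["gfix", "-v", "-f", db]),
        ("3.  last resort, on a copy only", ["gfix", "-m", copy]),
    ]


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for label, argv in recovery_steps():
        print(f"{label}: {' '.join(argv)}")
```

Copy the database file for the last step only while the server is shut down, so the copy is consistent.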

If none of those work, ask here again.



Regards,

Ann
www.ibphoenix.com
We have answers.