firebird-support - Re: [firebird-support] internal gds software consistency check (can't continue after bugcheck)

Subject	Re: [firebird-support] internal gds software consistency check (can't continue after bugcheck)
Author	Helen Borrie
Post date	2008-11-11T21:33:54Z

At 07:23 AM 12/11/2008, you wrote:

>Hi
>
>I have a FB 2.0.1 db (superserver) that is a back end to a web service
>application. There are two programs accessing the DB. One is an Apache
>web service (CGI), the other is a std windows app that runs on the same
>machine that pumps data back and forth between the web service db and
>another FB db.
>
>We have had very slow performance with the DB and discovered that it was
>corrupt; both times we have used gfix and backup/restore to mend the DB
>and restore performance. The only other time we have ever had problems
>with a FB DB has been a hardware issue. However we have just moved the
>system to a newer and faster machine and have the same problem (after
>half a days use).
>
>The interface app (not web service) logged the following error ...
>
>Failed to update SOAP DB, table : StockLevel. Error : ISC ERROR
>CODE:335544333
>
>ISC ERROR MESSAGE:
>internal gds software consistency check (can't continue after bugcheck)
>
>STATEMENT:
>TIBOInternalDataset: "<TApplication>.MainForm.zMainDM.CheckSOAPGe
>
>
>The query being used is a read only select statement running in a read
>only transaction.

So it is bumping into "something nasty" (a corrupt record or an out of memory condition) when trying to retrieve the set.

If your r/o transaction is long-running, make sure its isolation is Read Committed.

And, since you are using IBO, make sure the app is compiled with IBO 4.8.7.

>The only thing that is different to other DBs we have done is that this
>one uses many more triggers. Could these be causing the problem?

Release versions so far of v.2.0.x don't handle an out of memory condition properly - the best the engine can do under these conditions is to throw an internal gds consistency check error and crash. It is fixed in the forthcoming v.2.0.5. The test builds should be available within the week.

But, if this is the source of the problem, then the fix won't *prevent* an out of memory condition. I'd want to scour those triggers and find whether there are rogues there that are recursing infinitely. Another thing that can blow up memory is repeatedly hitting the same record with updates inside the one uncommitted transaction. This becomes likely if you have After triggers that are targeted at a small set of "bottleneck" records.
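One way to start that trigger audit is to write down, per table, which other tables its triggers write to, and then look for chains that loop back on themselves. The following is only a sketch in Python under that assumption: the map is built by hand from the trigger source, and the table names and the function find_trigger_cycle are hypothetical, not taken from the database in question. A loop found this way is a candidate for runaway recursion, not proof of it, since the trigger bodies may contain conditions that stop the chain.

```python
from typing import Dict, List, Optional, Set

# Hypothetical map: table -> tables that its triggers insert into,
# update or delete from. Built by hand; names are illustrative only.
TriggerMap = Dict[str, List[str]]


def find_trigger_cycle(writes: TriggerMap) -> Optional[List[str]]:
    """Return one chain of tables that loops back on itself, or None."""
    done: Set[str] = set()  # tables fully explored without finding a loop

    def visit(table: str, path: List[str]) -> Optional[List[str]]:
        if table in path:
            # We came back to a table already on the current chain.
            return path[path.index(table):] + [table]
        if table in done:
            return None
        for target in writes.get(table, []):
            cycle = visit(target, path + [table])
            if cycle:
                return cycle
        done.add(table)
        return None

    for start in writes:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None
```

For example, if the triggers on a hypothetical STOCKLEVEL table write to AUDIT, and the triggers on AUDIT write back to STOCKLEVEL, the function reports that chain, and those are the triggers to read first.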

An out-of-memory condition can also occur if RAM is short and there's not enough temp space on disk for a sort...although in these cases you would usually see an i/o error accompanying the gds consistency check, if the engine had time to figure that out before dying.

>I plan to try running just the Web side of the app for a time to see if
>this app is the culprit.

If out-of-memory turns out to be the source of the problem, then it's probably also the source of the kind of corruption that you're encountering - the kind that gfix can find and fix (i.e., logical damage like broken relationships) - especially if the application code is not totally careful about task atomicity.

If the Web app is doing updates, etc., then disabling the local application won't disable problem triggers. It might mitigate the problem to the degree that you stop getting a crash condition in the engine. In that case, disabling the local app simply masks the problem.

Don't overlook the fact that a Win32 process can't use more than 2 GB of RAM at any point. Superserver (SS) is one application operating many threads. If your web app is also threading within its threads then it's likely to be eating lots of resources at times...with this installation it could be just a case of too many straws on the camel's back...too many connections in the pool, perhaps?
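To put rough numbers on that ceiling, the page cache for a database takes about page size times the number of cache pages, and whatever is left under 2 GB has to cover sorts, metadata, and everything else the server process allocates. The Python sketch below only does that arithmetic; the function names, the other_bytes parameter, and the figures in the usage note are illustrative assumptions, not measurements from this installation.

```python
# The 2 GB per-process ceiling for a Win32 process mentioned above.
WIN32_PROCESS_LIMIT = 2 * 1024**3


def page_cache_bytes(page_size: int, cache_pages: int) -> int:
    """Approximate size of one database's page cache in bytes."""
    return page_size * cache_pages


def headroom_bytes(page_size: int, cache_pages: int, other_bytes: int = 0) -> int:
    """Bytes left under the 2 GB ceiling after the page cache.

    other_bytes is a hypothetical allowance for everything else the
    server process holds (sort buffers, metadata, and so on).
    """
    return WIN32_PROCESS_LIMIT - page_cache_bytes(page_size, cache_pages) - other_bytes
```

With an assumed 4096-byte page size and 2048 cache pages, page_cache_bytes(4096, 2048) comes to 8 MB, which is small next to the ceiling. If the cache settings have been raised a long way, or the result of headroom_bytes is getting tight, that is worth knowing before blaming the triggers.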

>Does anybody have pointers for where to start looking?

Without trying to be too boring...have you run MemTest over the RAM in the new machine?

./heLen