Subject FireBird 1.0 server crash with stack overflow
Author Riaan Nagel
Hi,

One of our dedicated Win2K FireBird 1.0 servers crashed a few days
ago, without any entry in the interbase.log and with only the
following message in the System Event Viewer:

"The Firebird Server service terminated unexpectedly. It has done
this 1 time(s). The following corrective action will be taken in 0
milliseconds: No action."

Both the FireBird server process and the FireBird Guardian had exited
cleanly, so the FireBird server had to be restarted manually.

Also, earlier the same day, there had been a stack overflow error
reported in the application, which seemed to come from the database:

"Stack overflow. The resource requirements of the runtime stack have
exceeded the memory available to it."

This was followed by connections to the database server being reset,
with the usual message:

Unable to complete network request to host "<server IP address>".
Error writing data to the connection. An existing connection was
forcibly closed by the remote host.

Could this be related to the incorrect usage of a SUSPEND statement
in an executable procedure?

This was during normal operating hours, towards the end of the
business day (not during maintenance or bulk updates). The only
abnormal behavior before the crash was that, shortly before ibserver
crashed, memory usage (page file bytes of ibserver.exe) suddenly went
up by almost 50MB (from ± 258MB to ± 307MB), after being mostly
stable until then. After the FireBird server was restarted manually,
its memory usage went up to the abnormally high level again and has
been climbing slowly since (another 20MB in a week's time).

The only recent interaction with the database, outside of normal
operation, was a large batch update (affecting at least 1 million
rows) and metadata changes with the upgrade to a new version of the
server application the previous night.

"Normal operation" means use in an online transaction processing
application that provides SOAP services over the internet, with a
large national installed base (of the client application) and a
throughput of at least 10 000 client requests daily. This translates
into about six times as many database operations (two of every six
being updates or inserts), along with the corresponding transaction
starts and commits. We are not using retaining commits or rollbacks.
The application uses connection pooling (10 connections) and dbExpress
access components.
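
To put these load figures in perspective, here is a rough
back-of-the-envelope sketch in Python. It assumes that the factor of
six and the two writes apply per client request, and that the load is
spread evenly over an eight-hour business day. The eight-hour figure
and all variable names are assumptions for illustration, not
measurements.

```python
# Rough daily load estimate from the figures quoted in this post.
# Assumption: "six times as many" and "two are updates or inserts"
# both apply per client request.
requests_per_day = 10_000   # stated lower bound
ops_per_request = 6
writes_per_request = 2

db_ops_per_day = requests_per_day * ops_per_request
writes_per_day = requests_per_day * writes_per_request
reads_per_day = db_ops_per_day - writes_per_day

# Assumption: an 8-hour business day (not stated in the post).
business_seconds = 8 * 60 * 60
avg_ops_per_second = db_ops_per_day / business_seconds

print(f"database operations per day: {db_ops_per_day}")
print(f"  of which writes: {writes_per_day}, reads: {reads_per_day}")
print(f"average operations per second: {avg_ops_per_second:.2f}")
```

Under these assumptions the average works out to roughly two database
operations per second, although peaks will of course be higher.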

The database is set up not to sweep automatically, has forced writes
turned on, a 4KB page size and 40 000 page buffers, is around 1GB in
size in a single file with no shadows and no replication, and is backed
up every evening. We are not currently doing manual sweeps either.

The server configurations are as follows:
Processor: 1.2GHz
Physical Memory: 256MB
Virtual Memory: 384MB
NIC: Intel Pro100 S (x2)
OS: Windows 2000 SP3, running IIS5
Software mirroring is done to one other hard drive.
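
As a sanity check on the memory figures in this post, the sketch below
compares the configured page cache with the physical memory of the
server. It assumes that each of the 40 000 page buffers holds one 4KB
page and that the cache belongs to the single ibserver process; the
variable names are illustrative only.

```python
# Compare the configured page cache with the server's physical memory.
# Assumption: cache size = page size * number of page buffers.
PAGE_SIZE_BYTES = 4 * 1024       # 4KB page size
PAGE_BUFFERS = 40_000            # configured page buffers
PHYSICAL_MEMORY_MB = 256         # physical memory of the server
OBSERVED_MB = (258, 307)         # ibserver.exe page file bytes, before and after the jump

cache_mb = PAGE_SIZE_BYTES * PAGE_BUFFERS / (1024 * 1024)
cache_share = cache_mb / PHYSICAL_MEMORY_MB
over_physical = [mb - PHYSICAL_MEMORY_MB for mb in OBSERVED_MB]

print(f"page cache when fully populated: {cache_mb:.2f} MB")
print(f"share of physical memory: {cache_share:.0%}")
print(f"observed usage above physical memory (MB): {over_physical}")
```

If the assumption holds, the cache alone can take up well over half of
the physical memory, and both observed memory readings already exceed
the 256MB installed.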

Our SOAP server application runs on a separate server, connected to
the Internet, and is connected to the dedicated database server by a
dedicated crossover cable (hence the two network cards, and two IP
addresses, for each server).

Any help or clues to answers would be MUCH appreciated!

Thanks,
Riaan