Subject Re: Kill FB processes
Author Paulius Pazera
I thought it might be helpful to share some of our experience working
with 1.5.3/4 classic

> I am running a Windows 2003 server with FB 1.5.3 classic. I have a
> fb_inet_server process that appears to be dead and I believe that it
> is causing the OAT to be "stuck".

it looks like the TCP/IP keepalive feature is not enabled in 1.5.x classic
for Windows. This means that a networking problem (e.g. an unplugged
network or power cable) will leave fb_inet_server running forever; it
will not go away even after 2 hours (the default keepalive timeout). In
many cases this leaves a stuck transaction. We checked several of our
sites: the transaction gap was a problem on all Windows databases (1-5M
gap), while all Linux databases were OK. Thanks to Paul Beach & Co. we
have a custom build for Windows with keepalive enabled (still testing
in-house), but we would still love to see this 'simple fix' in the 1.5.5
release (yeah, I know the answer, but I'm still trying ;))
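
For anyone unfamiliar with keepalive, here is a minimal sketch of what
enabling it on a TCP socket looks like. This is not Firebird's code: the
function name make_keepalive_socket and the idle_seconds value are made
up for illustration, and the idle-time option is only set where the
platform exposes it.

```python
import socket


def make_keepalive_socket(idle_seconds=120):
    """Create a TCP socket with keepalive probes switched on.

    Hypothetical helper for illustration only; Firebird itself is not
    written in Python and its real networking code is not shown here.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the OS to probe an idle connection so a dead peer is noticed.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The idle time before the first probe is OS-specific. TCP_KEEPIDLE
    # is not available on every platform, so it is guarded here.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock
```

Without SO_KEEPALIVE the OS never probes an idle connection, so a server
process blocked waiting for a client that has vanished has nothing to
wake it up.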

> What are the ramifications of simply killing the process?

you cannot avoid killing that fb_inet_server; even a normal server
shutdown would kill it (1.5.x classic does not have a graceful way
to terminate fb_inet_server processes)

we do quite a bit of killing, mostly during upgrades when we
need to stop Firebird before changing metadata. We also kill
processes to stop runaway queries. Over the years we have noticed only
one real db corruption due to a kill, when we could neither back up nor
gfix the fdb file ('wrong page type' error). Quite likely that corruption
was due to forced writes (FW) being off (FW doesn't work on Linux)

in many other cases (even with FW=on or
MaxUnflushedWrites=1/MaxUnflushedWriteTime=1 under Windows), if we
kill an fb_inet_server process that is doing many inserts/updates and
then run gfix -validate, we usually see thousands of errors ('record
level', 'index page', 'database page'). According to IBPhoenix, most
of them are harmless and expected in such situations ('Page is an
orphan', 'Index has orphan child page', 'Relation has orphan
backversions'), while others may or may not indicate real problems
('Index is corrupt')

also, when firebird.log is flooded with thousands of orphan errors,
it's almost impossible to spot other messages that might point to a
real problem
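
One rough way to cut through the noise is to set the orphan messages
aside and read only what is left. The sketch below assumes one message
per line and that every harmless message contains the word "orphan", as
in the texts quoted above. The function name and the sample lines are
made up for illustration; a real firebird.log has multi-line entries
that would need to be grouped first.

```python
def split_orphan_messages(lines):
    """Separate orphan messages from everything else.

    Assumption: a message is an 'orphan' message if it contains the
    word 'orphan' (case-insensitive). Blank lines are skipped.
    """
    orphans, others = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if "orphan" in text.lower():
            orphans.append(text)
        else:
            others.append(text)
    return orphans, others


# Illustrative input only, not copied from a real log.
sample = [
    "Page 1234 is an orphan",
    "Index 2 has orphan child page at page 5678",
    "",
    "Index 3 is corrupt",
]
orphans, others = split_orphan_messages(sample)
print(len(orphans), "orphan messages set aside")
for message in others:
    print("check:", message)
```

This only hides the noise. Whether a remaining message such as 'Index is
corrupt' is a real problem still has to be judged case by case.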

Paulius