Subject Re: [firebird-support] Firebird Program loses network connection
Author Norman Dunbar
Morning all,

> I have encountered a weird network problem and have no idea what might be the cause of it.
As far as I'm concerned, all network problems are weird! ;-)

> ...
> Client connection path to the database is using TCP/IP
> Situation: Client's program frequently shows
> Error writing data to the connection
> An existing connection was forcibly closed by the remote host
That sounds to me as if the server end is killing the connection, as
opposed to the connection being "lost" due to network problems or the
client end failing for some reason.

> Client stated it happened after using the program for some time.
> As testing, I tried saving couple of big transaction on the client computer and will encountered the same error. Note: Processing time for the transaction takes less then 3 mins.
Is the transaction doing something CPU intensive, for example a huge
sort? If so, it's possible that there is no keep-alive on the network
and some configuration option on the server is killing off "dead"
connections.

This is something I get occasionally with my Oracle database servers at
work. They are configured (and I know not how!) to drop dead connections
after a certain time. When a client submits a huge CPU intensive
transaction, it runs and, because there's no network traffic in the
meantime, the server assumes the connection has died and drops it.

When the application tries to talk again to the server, it gets the
message that you are seeing - force closed by remote host.
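
For what it's worth, this is roughly what "keep-alive" means at the
socket level. It is a minimal Python sketch, not anything taken from
Firebird or Oracle: the function name enable_keepalive and the timer
values are made up for illustration, and the three TCP_KEEP* options are
only set where the platform exposes them. In a real application the
database client library owns the socket, so treat this purely as an
illustration of the mechanism.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive for sock and report whether it is enabled.

    idle, interval and count are illustrative values, not recommendations.
    """
    # Ask the OS to send keep-alive probes on an otherwise silent connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timers are platform specific, so only set the ones that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of silence before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes allowed before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

With probes going back and forth, anything in the path that drops
"silent" connections keeps seeing traffic, even while the server is busy
on a long transaction.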

> ...

> Testing on network so far shows it is stable.
Probably because there is plenty of traffic during the test. You need to
initiate a connection with the server and then do nothing.
Possibly an SSH session to the server could be opened and then simply
left alone to see if it dies too? Of course, you need to try to get a
response from the server in order to know whether it has died or not,
which means that you have effectively reset the aliveness of the
connection.
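
The idle test could be sketched along these lines in Python. This is an
assumption-laden sketch rather than a proper diagnostic tool: the
function name idle_then_probe, the probe byte and the result strings are
all invented here, and the target should be some harmless TCP service on
the server (the SSH port, say) rather than the database port, since I
have no idea how the Firebird server reacts to stray bytes.

```python
import socket
import time


def idle_then_probe(host, port, idle_seconds, probe=b"\n", timeout=5.0):
    """Connect, stay silent for idle_seconds, then send probe and report."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        time.sleep(idle_seconds)  # no traffic at all during this window
        try:
            sock.sendall(probe)   # first write after the idle period
            data = sock.recv(1)   # wait for any reaction from the peer
        except socket.timeout:
            return "open (no reply within timeout)"
        except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError) as exc:
            return "dropped: " + exc.__class__.__name__
        # An empty read means the other end closed the connection.
        return "closed by peer" if data == b"" else "alive (peer replied)"
```

Because the probe itself counts as traffic, each call tests exactly one
idle period. To find the cut-off, run it with increasing values of
idle_seconds, using a fresh connection each time.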


> Perform test by pinging the server and then transferring file to the server while the program is running. Will get the same error above but ping and file transfer continue without hiccup.
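I think, but I'm not a network guru, that ping uses ICMP, which is
neither TCP nor UDP, so a working ping says nothing about whether your
TCP connection to the database is still alive. The file transfer is a
separate connection with plenty of traffic of its own.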

> Perform another test by having two program open. Only the program running the test above will crashed with error and the second program can continue being used without any problem.
Do both programs run the same large transaction? What happens if both
do? I predict both will lose the connection. Also, when the client is
running the large transaction, what does Task Manager show for the
client app - I suspect Not Responding, but I might be wrong.

And also, while the transaction is running, check the network traffic
(start->settings->Network and Dialup connections, double click on LAN
and see what's happening) - are you seeing sent/received clocking up
packet counts while the transaction is running - at the same rate as
before the transaction? Obviously there will be background traffic -
your email, other stuff, but you might get a clue. Maybe!

> ...

> Also suspected something is not right with the program but running the same test at my own office with lower end computer and much heavier transaction will not have any problem. I pretty much given up and not sure what so unique about this client network that keep getting this error.
As I said, I'm not a network guru, but it does sound familiar. I suspect
that the server has been configured, somehow, to kill off dead
connections and your long running transaction is thought to be dead.

When you did your test on the low spec machine, were you running exactly
the same transaction or a different one? Anything that talks across the
network while running is going to stay alive, anything that hits the
server for a long period of CPU, for example, is possibly going to be
dropped.

I repeat, I'm not a network guru and I'm sure others will jump in and
assist you here, but the above may be worth considering.


And finally, the following sage advice comes from the firebird.conf file:

# Normally, Firebird uses SO_KEEPALIVE socket option to keep track of
# active connections. If you do not like default 2-hour keepalive
# timeout then adjust your server OS settings appropriately. On
# UNIX-like OS's, modify contents of
# /proc/sys/net/ipv4/tcp_keepalive_*. On Windows, follow instructions of
# this article:
# http://support.microsoft.com/default.aspx?kbid=140325
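
On a Linux server, those /proc entries can be read to see how long a
silent connection survives before the OS gives up on it. A small Python
sketch, assuming the usual three files (tcp_keepalive_time,
tcp_keepalive_intvl and tcp_keepalive_probes); the function names are
mine and it only reads the values, it changes nothing.

```python
from pathlib import Path

KEEPALIVE_FILES = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return the system-wide keep-alive settings found under base."""
    settings = {}
    for name in KEEPALIVE_FILES:
        path = Path(base) / name
        if path.is_file():  # the files only exist on Linux-like systems
            settings[name] = int(path.read_text().strip())
    return settings


def seconds_until_dead(settings):
    """Idle time before the first probe, plus the time for all probes to fail."""
    return (settings["tcp_keepalive_time"]
            + settings["tcp_keepalive_intvl"] * settings["tcp_keepalive_probes"])
```

With the 2-hour timeout mentioned above (7200 seconds) plus, for
example, 9 probes sent 75 seconds apart, seconds_until_dead gives
7200 + 9 * 75 = 7875 seconds.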

As Ann would say, good luck!


Cheers,
Norm.

--
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767