Subject Re: Re: [firebird-support] Even with NIC to NIC connection network request to host failed
Author Helen Borrie
At 01:25 PM 10/03/2004 +0530, you wrote:


> >
> >At least your NIC test proves there's something wrong with the network
> >configuration. Have you eliminated DHCP?
>
>Hmm, no, not with certainty. There were some problems initially with the
>configuration. This was resulting in very long connection times on this
>machine.
>But we isolated that and made the changes.
>
>When I say not with certainty, I mean I don't know what test to run to be
>certain that DHCP is not causing the problems.

If your apps are connecting to the IP address and, at some point, there is
a moment when no client processes are actually running, then there may be
nothing to stop DHCP from reallocating that IP address. A key thing here
is: what other services are being run on that server?
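
One rough way to test whether the address stays reachable is to probe the
server's Firebird port at intervals and note when an attempt fails. Below is
a minimal sketch in Python, assuming the server listens on the default
Firebird port 3050. The function names, timeout and interval are made up for
illustration. A failed probe at the moment the application reports its error
would point towards the network rather than the database engine.

```python
import socket
import time


def probe(host, port=3050, timeout=5.0):
    """Return (ok, detail) for one TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, unreachable hosts and timeouts.
        return False, str(exc)


def watch(host, port=3050, interval=60, rounds=10):
    """Probe repeatedly and print a timestamped line for each attempt."""
    for _ in range(rounds):
        ok, detail = probe(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        status = "OK" if ok else "FAILED"
        print(f"{stamp} {host}:{port} {status} {detail}")
        time.sleep(interval)
```

Calling watch("192.168.1.202") from a client machine would log one line a
minute for ten minutes. Note that with Classic each probe briefly starts a
server process, so keep the interval generous.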

> >
> >There is no ibconfig on Linux. The configuration file for IB 6 and
> >Firebird 1 is isc_config. Consider the possibility that the server is
> >crashing because you haven't configured any temp space and/or that the
> >default temp space (the filesystem /tmp on Linux) is missing.
>
>Actually, when I was lurking I saw a post to this effect and checked that.
>I also classified this in my mind as an unlikely reason, because in that
>thread the server was crashing when running large sorted queries, but here
>we seem to be having trouble establishing the connection.

Yes, with Classic, it's not very likely to be the prime problem. But
resources overall are likely to be a problem, aren't they? Is anyone
monitoring memory?

> >
>
>No, I am not maintaining the Linux system. The person maintaining it is
>reasonably competent on Linux. He maintains Linux systems at a number of
>customer sites, at least 40 to 50 servers.
>
>But the trouble is I don't know what questions to ask. If there was some
>sort of checklist somewhere, I could ask the person to check it out.

Or you could turn that around the other way and ask the network guy for a
checklist. Like - is the IP address stable or is it subject to change? Do
we have enough memory available to accommodate all these connections?


>With troubleshooting like this you need intuition. I have good intuition on
>Windows, hardly any with Linux.

Well, that's why I asked if you had network assistance. You'd need to ask
the n/w admin what else is competing with your database server for
resources. A server with 512 Mb is pretty light for the number of
connections you are running.

> >
> >>Both of these have been left more or less at the default.
> >>Could it be that the server is running out of memory and thus unable to
> >>create a new process on the server ??
> >
> >That depends on which server you are running. If it's Classic then, yes,
> >it's a very real possibility that 512 Mb is going to get used up if you are
> >allowing unlimited connections. If it's Superserver, who knows?
>
>It is Firebird Classic 1.0.3. There are roughly 90 single-threaded
>connections running. 30 will be seeing lots of activity, 60 mainly on timer
>tick for a few seconds (but the connections don't get closed).

Let's suppose you have the defaults for page size (4 Kb) and cache (75
pages). For each of those connections you have 300 Kb of static cache plus
one instance of gds_inet_server + gds_lock_mgr for each (another 150
Kb). That is about 450 Kb per connection, so about 40 Mb for 90
connections, not counting any activity being done by them. And you say
these connections are stable, fine.
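
The estimate above can be worked through as a short calculation. This is
only a sketch of the arithmetic in Python: the page size, cache size and
150 Kb overhead are the suppositions made in this message, the count of 90
comes from the connection figure reported in the thread, and the names are
made up for illustration.

```python
PAGE_SIZE_KB = 4            # supposed default page size
CACHE_PAGES = 75            # supposed default cache pages per connection
PROCESS_OVERHEAD_KB = 150   # rough per-connection process overhead
CONNECTIONS = 90            # connection count reported in the thread


def static_footprint_kb(connections, page_size_kb=PAGE_SIZE_KB,
                        cache_pages=CACHE_PAGES,
                        overhead_kb=PROCESS_OVERHEAD_KB):
    """Static memory, in KB, held by idle Classic connections."""
    per_connection = page_size_kb * cache_pages + overhead_kb
    return connections * per_connection


total_kb = static_footprint_kb(CONNECTIONS)
print(f"{total_kb} KB, about {total_kb / 1024:.1f} MB")
# prints "40500 KB, about 39.6 MB"
```

Ten further connections opened in a burst would add roughly another 4.5 MB
on the same assumptions, before any sorting or query activity.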

>The multithreaded app opens one connection in the main thread and holds it
>open. About once every 10 minutes it tries to open 10 connections, one after
>the other, for use in threads.
>
>This is always the place where the trouble starts.

So it's not untoward to suppose that one of those requests tips over
because it can't get the memory resources required.

>So what do I ask the Linux guy to check? Any hints?

It would be worth finding out from him how much of that 512 Mb is actually
available for the server processes.
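
As a starting point for that question, the Linux admin could read
/proc/meminfo on the server. Here is a minimal sketch in Python that parses
it, assuming the "Key: value kB" line format. The function names are
hypothetical, and treating free memory plus buffers plus page cache as
"roughly available" is only an approximation.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip lines that do not look like 'Key: <number> ...'.
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def roughly_available_kb(info):
    """Free memory plus buffers and page cache (an approximation)."""
    return (info.get("MemFree", 0)
            + info.get("Buffers", 0)
            + info.get("Cached", 0))


def report(path="/proc/meminfo"):
    """Print total, roughly available and free swap figures from path."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    print(f"MemTotal: {info.get('MemTotal', 0)} kB")
    print(f"roughly available: {roughly_available_kb(info)} kB")
    print(f"SwapFree: {info.get('SwapFree', 0)} kB")
```

Running report() on the server shortly before the app opens its batch of 10
connections would show how much headroom is left at that point.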

> >
> >The error message you are getting ("The system could not find the
> >environment option that was entered") looks like a custom message. Is it
> >one you made up yourself to respond to a specific ISC error code? Have you
> >explored the conditions that cause that ISC error code?
>
>Hmm, no, it is not. That is the most puzzling thing. I am using an exception
>logger. Originally the message would say
>
>ISC ERROR MESSAGE:
>Unable to complete network request to host "192.168.1.202".
>Failed to establish a connection.

That one suggests a break in the network - a dynamically changed host IP
address would cause that.

>After this NIC-to-NIC experiment, "The system could not find the environment
>option that was entered" is appended.

That's weird -- "environment option". Is it possible the app is trying to
connect to a database that isn't on the server's filesystem?

/hb