Subject | Re: [firebird-support] Re: Firebird SS-1.5.1 and RedHat ES 4 troubles |
---|---|
Author | Helen Borrie |
Post date | 2006-02-17T11:21:11Z |
At 06:56 PM 17/02/2006, you wrote:
>Helen,
> follows what reported from firebird.log between 2 server restart:
>
>saert2.unit.net (Client) Wed Feb 15 04:08:54 2006
> /opt/firebird/bin/fbguard: guardian starting bin/fbserver
>
>
>saert2.unit.net (Server) Wed Feb 15 04:09:52 2006
> INET/inet_error: select in packet_receive errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 06:33:35 2006
> INET/inet_error: select in packet_receive errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 07:12:50 2006
> INET/inet_error: read errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 07:53:24 2006
> INET/inet_error: select in packet_receive errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 09:18:00 2006
> INET/inet_error: read errno = 104
>
>saert2.unit.net (Server) Wed Feb 15 09:18:00 2006
> INET/inet_error: read errno = 104
>
>saert2.unit.net (Server) Wed Feb 15 09:44:39 2006
> INET/inet_error: read errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 13:19:14 2006
> INET/inet_error: read errno = 9
>
>saert2.unit.net (Server) Wed Feb 15 15:56:55 2006
> INET/inet_error: read errno = 9

So far, the server is losing contact with clients occasionally
through the day. An application crashing or timing out, or just
blips on the network, could cause these interferences. Are they
using wireless?

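As a rough aid for reading entries like the ones quoted above, here is a minimal Python sketch (not part of Firebird) that tallies `INET/inet_error` lines by operation and errno. The function names and the small errno table are illustrative assumptions; the table lists the Linux meanings of the three values that appear in this log, and the numbers differ on other platforms.

```python
import re
from collections import Counter

# Linux meanings of the errno values seen in the quoted log.
# Assumption: the server runs on Linux; other platforms number these differently.
LINUX_ERRNO = {
    9: "EBADF (bad file descriptor)",
    104: "ECONNRESET (connection reset by peer)",
    111: "ECONNREFUSED (connection refused)",
}

# Matches e.g. "INET/inet_error: select in packet_receive errno = 9"
ERRNO_RE = re.compile(r"INET/inet_error: (.+?) errno = (\d+)")


def tally_inet_errors(log_text):
    """Count INET/inet_error entries by (operation, errno)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERRNO_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def describe(counts):
    """Render the tally as readable lines, sorted by operation then errno."""
    lines = []
    for (operation, code), n in sorted(counts.items()):
        name = LINUX_ERRNO.get(code, "unknown")
        lines.append(f"{n} x {operation}: errno {code} = {name}")
    return lines
```

To use it, read the contents of firebird.log into a string and pass it to `tally_inet_errors`, then print the lines returned by `describe`.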
>saert2.unit.net (Client) Wed Feb 15 19:42:58 2006
> INET/inet_error: connect errno = 111

Now a local client wants to connect but the server is unavailable.

>saert2.unit.net (Client) Wed Feb 15 19:42:58 2006
> /opt/firebird/bin/fbguard: guardian starting bin/fbserver

Guardian is a watchdog program that restarts the server after it has
crashed. In this case, Guardian has been kicked into life by a local client.

>Any idea ?

Not really. Something is crashing your client threads (3-finger
salute?). At some point after 4 pm, the server process itself
crashes. Have you looked at the system log for nasty words like SEGFAULT?

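One way to do that system log check is sketched below in Python. It is only an illustration: the function name and the `process` filter are hypothetical, and the log path in the usage note is an assumption about a typical Red Hat setup, so adjust it to wherever your syslog actually writes.

```python
import re

# Case-insensitive match for the usual wording of a crash report.
SEGFAULT_RE = re.compile(r"segfault|segmentation fault", re.IGNORECASE)


def find_segfaults(lines, process=None):
    """Return log lines that mention a segfault.

    If `process` is given (e.g. "fbserver"), keep only lines that
    also contain that string.
    """
    hits = []
    for line in lines:
        if not SEGFAULT_RE.search(line):
            continue
        if process is not None and process not in line:
            continue
        hits.append(line.rstrip("\n"))
    return hits
```

For example, `find_segfaults(open("/var/log/messages"), process="fbserver")` would list matching lines, assuming the system log lives at that path and is readable by the user running the script.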
Incidentally, that kernel bug I thought I sort-of remembered
yesterday was a red herring. It's one that affects dual-core
Opterons, where a bug in a memory management subsystem was causing
segfaults in Classic processes on that hardware and, as I recall, was
fixed with a kernel patch.
If this were me at this stage, I would take the
conservative approach: uninstall the 1.5.3 NPTL package, replace it
with the 1.5.3 "old threading" package and set
LD_ASSUME_KERNEL=2.2.5 in both of the places suggested in the release
notes. That should at least get you back to a stable threading situation.
In reviewing your original posting, I see this:
"The only DDL task that the application accessing the db
does, is to create/drop dinamically some very simple computed fields
in a couple of tables. But this not happens very often."
- how often? once a year? once a month? once a week? every 2-3
days? more often sometimes?
- do these computed fields involve UDFs?
- WHY is your application code performing dynamic DDL at all?
I don't know what else to look at. I've got 1.5.2 SS without NPTL
support running on Mandriva 2005 on an AMD Sempron 2200 (no
hyperthreading) and it's trouble-free.
Hopefully someone has been where you are and can throw some light.
./heLen