Subject | Re: Firebird SS-1.5.1 and RedHat ES 4 troubles |
---|---|
Author | axsp2000 |
Post date | 2006-02-17T16:15:07Z |
--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@...> wrote:
I checked the log; there is no SEGFAULT anywhere.
The Firebird server is frozen.
The fbmgr "shut" command times out after 2 minutes with the message:
"unable to complete network request to host 'localhost'
failed to establish a connection
can not attach to server"
There is no way to restart it with fbmgr. Killing fbserver manually solves
the problem.
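For anyone who wants to tell "nothing is listening" apart from "something is listening but not answering", a small TCP probe is one option. This is only a sketch: the function name `probe` is made up for illustration, and port 3050 is assumed because it is the Firebird default (your installation may use another). Note that a frozen server can still show up as "accepted", since the kernel completes the TCP handshake on behalf of a listening socket.

```python
import socket


def probe(host="localhost", port=3050, timeout=5.0):
    """Classify a plain TCP connection attempt (hypothetical helper).

    Returns "accepted", "refused", "timeout", or "error: ...".
    It does not speak the Firebird wire protocol at all.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "accepted"
    except ConnectionRefusedError:
        # Nothing is listening on that port (errno 111 on Linux).
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe())
```

If the probe says "accepted" while fbmgr and gstat still cannot attach, the process is most likely hung rather than gone.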
Running gstat on a database gives the following result after a couple of minutes of
waiting:
Database "/DATABASE/WELL_DBS/Remsa_01.gdb"
Database header page information:
Flags 0
Checksum 12345
Generation 92693
Page size 4096
ODS version 10.1
Oldest transaction 91779
Oldest active 91780
Oldest snapshot 58861
Next transaction 92686
Bumped transaction 1
Sequence number 0
Next attachment ID 0
Implementation ID 19
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 3
Creation date Jul 26, 2005 13:14:53
Attributes
Variable header data:
Sweep interval: 20000
*END*
Database file sequence:
File /DATABASE/WELL_DBS/Remsa_01.gdb is the only file
Database log page information:
Creation date
Log flags: 2
No write ahead log
Next log page: 0
Variable log data:
Control Point 1:
File name:
Partition offset: 0 Seqno: 0 Offset: 0
Control Point 2:
File name:
Partition offset: 0 Seqno: 0 Offset: 0
Current File:
File name:
Partition offset: 0 Seqno: 0 Offset: 0
*END*
Unable to complete network request to host "localhost".
-Failed to establish a connection.
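As a quick reading aid for the header above, the transaction counters can be subtracted to see how far the oldest transactions lag behind the next one. The sketch below only does that arithmetic on values copied from the output; `read_counters` and `GSTAT_HEADER` are names invented for this example, not part of any Firebird tool.

```python
import re

# Header excerpt copied from the gstat output above.
GSTAT_HEADER = """\
Oldest transaction 91779
Oldest active 91780
Oldest snapshot 58861
Next transaction 92686
Sweep interval: 20000
"""


def read_counters(text):
    """Return the numeric header fields as a dict keyed by field name."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z ]+?):?\s+(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


c = read_counters(GSTAT_HEADER)
gap_oldest = c["Next transaction"] - c["Oldest transaction"]
gap_active = c["Next transaction"] - c["Oldest active"]
print("next - oldest transaction:", gap_oldest)  # 907
print("next - oldest active:     ", gap_active)  # 906
print("sweep interval:           ", c["Sweep interval"])
```

Both gaps are around 900, well under the sweep interval of 20000, so these counters on their own do not suggest a long-stuck transaction.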
"top" shows the following:
VIRT RES SHR %CPU TIME
fbguard 3340 1296 S 0 0:0:0
fbserver 325m 28m S 0 2:21:68
73 tasks running
72 tasks sleeping
0 stopped
0 zombie
CPU 1.0% us 4.4% sy 0.0% ui 90.5% id 0.0% wa 0.7% hi 2.0% si
MEM 256044k total 254040k used 2008k free 27264k buffer
SWA 265064k total 192k used 264872k free 150368k cached
The computed fields use UDFs, but they are usually created/dropped once a month.
Sometimes it is more often, but that is not the case now.
> WHY is your application code performing dynamic DDL at all?
Because we sometimes need to show computations on some reports
to many clients, and this was the simplest and quickest way without
involving another application layer.
We'll try the non-NPTL version, as you suggested.
Thank you
Alessandro
> At 06:56 PM 17/02/2006, you wrote:
>
> >Helen,
> > follows what reported from firebird.log between 2 server restart:
> >
> >saert2.unit.net (Client) Wed Feb 15 04:08:54 2006
> > /opt/firebird/bin/fbguard: guardian starting bin/fbserver
> >
> >
> >saert2.unit.net (Server) Wed Feb 15 04:09:52 2006
> > INET/inet_error: select in packet_receive errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 06:33:35 2006
> > INET/inet_error: select in packet_receive errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 07:12:50 2006
> > INET/inet_error: read errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 07:53:24 2006
> > INET/inet_error: select in packet_receive errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 09:18:00 2006
> > INET/inet_error: read errno = 104
> >
> >saert2.unit.net (Server) Wed Feb 15 09:18:00 2006
> > INET/inet_error: read errno = 104
> >
> >saert2.unit.net (Server) Wed Feb 15 09:44:39 2006
> > INET/inet_error: read errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 13:19:14 2006
> > INET/inet_error: read errno = 9
> >
> >saert2.unit.net (Server) Wed Feb 15 15:56:55 2006
> > INET/inet_error: read errno = 9
>
> So far, the server is losing contact with clients occasionally
> through the day. An application crashing or timing out, or just
> blips on the network, could cause these interferences. Are they
> using wireless?
>
> >saert2.unit.net (Client) Wed Feb 15 19:42:58 2006
> > INET/inet_error: connect errno = 111
>
> Now a local client wants to connect but the server is unavailable.
>
>
> >saert2.unit.net (Client) Wed Feb 15 19:42:58 2006
> > /opt/firebird/bin/fbguard: guardian starting bin/fbserver
>
> Guardian is a watchdog program that restarts the server after it has
> crashed. In this case, Guardian has been kicked into life by a
> local client.
>
> >Any idea ?
>
> Not really. Something is crashing your client threads (3-finger
> salute?). At some point after 4 pm, the server process itself
> crashes. Have you looked at the system log for nasty words like
> SEGFAULT?
>
> Incidentally, that kernel bug I thought I sort-of remembered
> yesterday was a red herring. It's one that affects dual-core
> Opterons, where a bug in a memory management subsystem was causing
> segfaults in Classic processes on that hardware and, as I recall, was
> fixed with a kernel patch.
>
> If this was me at this stage, I would be ready to take the
> conservative approach: uninstall the 1.5.3 NPTL package, replace it
> with the 1.5.3 "old threading" package and set the
> LD_ASSUME_KERNEL=2.2.5 in both of the places suggested in the release
> notes. That should at least get you back to a stable threading
> situation.
>
> In reviewing your original posting, I see this:
> "The only DDL task that the application accessing the db
> does, is to create/drop dinamically some very simple computed fields
> in a couple of tables. But this not happens very often."
>
> - how often? once a year? once a month? once a week? every 2-3
> days? more often sometimes?
> - do these computed fields involve UDFs?
> - WHY is your application code performing dynamic DDL at all?
>
> I don't know what else to look at. I've got 1.5.2 SS without NPTL
> support running on Mandriva 2005 on an AMD Sempron 2200 (no
> hyperthreading) and it's trouble-free.
>
> Hopefully someone has been where you are and can throw some light.
>
> ./heLen
>
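For readers following the quoted firebird.log excerpt, the errno values can be tallied and given their standard Linux names. The sketch below assumes the server runs on Linux (errno numbers differ on other systems), so the small table is hard-coded rather than taken from the local platform; `summarize`, `LINUX_ERRNO` and `SAMPLE` are names invented for this example.

```python
import re
from collections import Counter

# Standard Linux errno meanings for the values seen in the log.
LINUX_ERRNO = {
    9: ("EBADF", "Bad file descriptor"),
    104: ("ECONNRESET", "Connection reset by peer"),
    111: ("ECONNREFUSED", "Connection refused"),
}

PATTERN = re.compile(r"INET/inet_error: (.+?) errno = (\d+)")

# A few lines copied from the log quoted above.
SAMPLE = """\
INET/inet_error: select in packet_receive errno = 9
INET/inet_error: read errno = 104
INET/inet_error: read errno = 104
INET/inet_error: read errno = 9
INET/inet_error: connect errno = 111
"""


def summarize(log_text):
    """Count (operation, errno) pairs found in firebird.log text."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts


for (operation, code), n in sorted(summarize(SAMPLE).items()):
    name, text = LINUX_ERRNO.get(code, ("?", "unknown"))
    print(f"{n} x {operation}: errno {code} = {name} ({text})")
```

Read this way, errno 111 on the "connect" line fits the comment above that a local client tried to connect while the server was unavailable.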