Subject Re: [firebird-support] strange hang
Author Maris Paupe
I collected some data while this 'hang' was happening. Yep, exactly as you
said, firebird was using 100% CPU (btw, the machine has two P4 processors).
I can't wait longer than 5 minutes to see whether the operation will finish,
so I always have to kill firebird and restart it. While this strange thing is
happening I can't make any connection to the database; it doesn't respond.
What could be the reason?
Here are some backtraces from gdb:

(gdb) thread 3
[Switching to thread 3 (Thread -1312822352 (LWP 502))]#0 0xb7f6e440 in
pthread_cond_timedwait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
(gdb) bt
#0 0xb7f6e440 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/tls/libpthread.so.0
#1 0x08059ed3 in gds__cleanup ()
#2 0x080535bb in ?? ()
#3 0xb7f6bb63 in start_thread () from /lib/tls/libpthread.so.0
#4 0xb7e2bc4a in clone () from /lib/tls/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread -1337988176 (LWP 3201))]#0 0xb7f6e440 in
pthread_cond_timedwait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
(gdb) bt
#0 0xb7f6e440 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/tls/libpthread.so.0
#1 0x08059ed3 in gds__cleanup ()
#2 0x080535bb in ?? ()
#3 0xb7f6bb63 in start_thread () from /lib/tls/libpthread.so.0
#4 0xb7e2bc4a in clone () from /lib/tls/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread -1210806144 (LWP 6279))]#0 0xb7e37e2b in
pthread_setcanceltype ()
from /lib/tls/libc.so.6
(gdb) bt
#0 0xb7e37e2b in pthread_setcanceltype () from /lib/tls/libc.so.6
#1 0xb7e254c8 in select () from /lib/tls/libc.so.6
#2 0x080be6f9 in ERRD_post ()
#3 0x080be382 in ERRD_post ()
#4 0x080cd471 in ERRD_post ()
#5 0x0804e564 in ?? ()
#6 0x0804e0be in ?? ()
#7 0xb7d69904 in __libc_start_main () from /lib/tls/libc.so.6
#8 0x0804dd41 in ?? ()
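
The backtraces above were collected one thread at a time by hand. A minimal
sketch of grabbing all threads in one go the next time the server spins,
assuming gdb is installed and the script has permission to attach to
fbserver. The function names and the script name are hypothetical, chosen
only for illustration; the gdb options (-p, -batch, -ex) and the
"thread apply all bt" command are standard gdb.

```python
import subprocess
import sys


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid and prints the
    backtrace of every thread; batch mode exits once the commands have run."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "thread apply all bt"]


def capture_backtraces(pid, timeout=60):
    """Run gdb against the process and return its combined output as text.

    Hypothetical helper: it needs gdb on the PATH and enough privileges
    to attach to the target process.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # Usage (assumed script name): python fb_backtrace.py <pid of fbserver>
    if len(sys.argv) == 2:
        print(capture_backtraces(sys.argv[1]))
```

Redirecting the output to a timestamped file each time the hang occurs would
make it easier to compare one hang against the next.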


Helen Borrie wrote:

>At 01:37 PM 1/03/2005 +0200, you wrote:
>
>
>
>>my firebird server hangs, I don't know the reason because nothing is
>>reported in the logs. I can give you ps output and tell you that I am using
>>the rfunc library and that there are 30 connections every 5 seconds. Maybe
>>the database needs some network tweaking? I use superserver version 1.5.1.
>>It hangs randomly, approximately once every 12 hours.
>>
>>ps afx output (see Rl state):
>>24209 ? S 0:00 /usr/lib/firebird2/bin/fbguard -f
>>24211 ? Rl 26:48 \_ /usr/lib/firebird2/bin/fbserver
>>
>>
>
>If this is ps -afx output, then no clients are connected. R means it's
>using CPU. What does "I" signify - is this something peculiarly Debian?
>
>What does a "hang" look like to you? i.e. describe what happens (or
>doesn't happen) when you think it is hanging.
>
>I'm tempted to suggest that you are seeing a GC or sweep thread running,
>though the ps tree output should show it.
>
>
>
>
>>What do you think about rfunc UDF library?
>>
>>
>
>Can't answer that - I don't use it.
>
>
>
>
>>btw, rfunc is using the ib_util_malloc function from libib_util.so from the
>>firebird superserver 1.5.2 package, which I installed separately (I
>>couldn't find a debian package containing this file); the server is
>>installed from the debian package.
>
>I didn't know of any changes to ib_malloc between subreleases. I think
>we'd have heard a lot of howls from UDF users if it was a problem.
>
>
>
>
>>What could you suggest?
>>
>>
>
>Look at limits.conf to see whether there is a cap on the number of TCP/IP
>clients that can be connected? A shot of ps -afx | grep fb taken during
>one of these hang-ups would be interesting.
>
>./hb