Subject | Troubleshooting pause in firebird |
---|---|
Author | Deon Cui |
Post date | 2012-05-22T09:08:13Z |
Hi there,
I am running a Firebird SuperClassic server on Scientific Linux 6.2. I
installed it from the EPEL repos. The version I am running is
2.5.1.26351.0.
This server is a dedicated Firebird server and does nothing else but run
Firebird and serve databases. However, I am using it to serve a
multi-database backend for a published application.
What I mean by that is the users connect to a published application via 2X,
and the application then talks to the Firebird server to access data.
Different groups of users use different databases, but the databases all
serve the same published application. In essence this is a multi-tenant
hosted environment.
I have had this running very well since deployment. However today I
encountered an issue that I am having a hard time troubleshooting.
For approximately 10 minutes today the Firebird server stopped responding.
Users launching the published application got a splash screen and the
application then hung. I contacted the lead developer of the application;
he tried to launch FlameRobin and it hung too.
I searched the Firebird logs and could see that I had received a huge number
of error 104s. I know that a 104 represents a disconnect. I have seen them
occasionally before, but never in such a large number at the same time.
I checked the running statistics: CPU usage by the fb_smp_server process
was nil and memory usage wasn't going up or down. As far as I could tell
the server was doing nothing.
After about 10 minutes everything came back, as if nothing had happened. I
rechecked the Firebird logs and they had 3 instances of error 9 reported.
From what I found, this means that Firebird hit a problem with the
network?
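To line the burst of 104s and the three error 9s up against the pause window, the log entries can be counted per minute. The following is only a minimal sketch: it assumes each firebird.log entry starts with a "hostname, timestamp" header line followed by indented message lines containing "errno = N", and the function name and the log path passed on the command line are illustrative, not part of Firebird. The names in the table are the standard Linux meanings of errno 9 and 104.

```python
import re
import sys
from collections import Counter
from datetime import datetime

# Standard Linux errno meanings for the two codes mentioned in the post.
LINUX_ERRNO = {
    9: "EBADF (bad file descriptor)",
    104: "ECONNRESET (connection reset by peer)",
}

# Assumed entry header: "<hostname>\tTue May 22 09:08:13 2012"
HEADER = re.compile(r"^\S+\s+(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4})\s*$")
# Assumed message fragment: "... errno = 104"
ERRNO = re.compile(r"errno = (\d+)")


def errno_counts_per_minute(log_text):
    """Count (minute, errno) pairs found in firebird.log-style text."""
    counts = Counter()
    stamp = None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            # Collapse the double space used for single-digit days.
            normalized = " ".join(header.group(1).split())
            stamp = datetime.strptime(normalized, "%a %b %d %H:%M:%S %Y")
            continue
        found = ERRNO.search(line)
        if found and stamp is not None:
            minute = stamp.strftime("%Y-%m-%d %H:%M")
            counts[(minute, int(found.group(1)))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log location differs between packages; pass it explicitly.
    with open(sys.argv[1], errors="replace") as handle:
        totals = errno_counts_per_minute(handle.read())
    for (minute, code), count in sorted(totals.items()):
        name = LINUX_ERRNO.get(code, "see errno(3)")
        print(f"{minute}  errno {code:>3}  x{count}  {name}")
```

Run against the server's firebird.log, this prints one line per minute and error code, which makes it easier to see whether the 104s started before, during, or after the hang. On Linux, errno 9 is a bad file descriptor rather than a network failure as such, so it may simply reflect sockets that were already closed.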
The whole time this was happening I was SSH'ed into the server, so I can
verify that there were no network issues preventing communication at the
time. All our application servers stopped being able to run the published
application during the pause, but none reported being unable to connect or
gave any error messages. It was as if the connection had succeeded and the
client was waiting for the Firebird server to return data.
fbguard was running, but it does not appear to me that it was involved in
restarting the process. Is there a log file for fbguard, or a way to verify
whether it had to get involved? I only checked ps aux, and the
fb_smp_server process start time had not changed.
I would appreciate any help or pointers in tracking this issue down. I do
not have much experience with firebird and this is the first firebird
deployment I have ever done.
Regards,
Deon
[Non-text portions of this message have been removed]