firebird-java - Re: [Firebird-Java] reuse of connection after failure.

Subject	Re: [Firebird-Java] reuse of connection after failure.
Author	Roman Rokytskyy
Post date	2007-03-12T15:47:02Z

Hi,

> Maybe a red herring but we have noticed one or two SIGSEGVs in the Firebird
> logs. We are running CS 2.01 RC2 on a separate machine. The problem
> seems to happen more frequently when we are under heavy load:
> approximately 50 000+ transactions (1 000 000+ queries/inserts) per
> hour. The system may run normally for hours/days and then trigger this
> on one machine in the cluster. Under heavy load it may trigger in 30 or
> so minutes. I am guessing that when Firebird segfaults we lose the
> connection and the connection pool is not recovering properly.

Yes, this is what happens.

> After changing the
> defaults on the connection pool (hard coded, see below) so that it
> throws away connections more often (maxIdleTime = 15) and pings
> connections, we seem to recover more easily.
>
> org.firebirdsql.pool.FBPoolingDefaults
> public static final int DEFAULT_IDLE_TIMEOUT = 10 * 1000;
> public static final int DEFAULT_PING_INTERVAL = 5 * 1000;
>
> org.firebirdsql.pool.BasicAbstractConnectionPool
> pingStatement = "SELECT CAST(1 AS INTEGER) FROM RDB$DATABASE";
>
>
> My real question, however, is how to get these hard-coded defaults into
> the JBoss firebird-ds.xml data source configuration file, so that I can
> use standard releases in the future.

This can't work. The issue is that your configuration uses JCA
connections, which have nothing to do with our pool.

The exceptions that you see show the following:

1. JBoss tries to start a transaction. Since Firebird crashed and the
current socket was already closed, Jaybird throws an exception (not shown in
your stack trace, but it is something like "Cannot connect to server" or
"Error reading data from connection").

2. JBoss creates a new connection and the system keeps running.

3. The Transaction Coordinator from JBoss notices that some XA transaction
is not yet finished (the one during which FB crashed) and tries to roll it
back. It uses the corresponding mechanisms of XA-enabled connections
(the rollback method with an Xid but with no managed connection).

4. Jaybird connects to a restarted Firebird and tries to end the in-limbo
transaction.
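Steps 3 and 4 can be sketched with the standard javax.transaction.xa
interfaces. This is only an illustration of the general recover/rollback
flow, not the JBoss or Jaybird implementation: FakeXaResource, SimpleXid,
rollBackUnfinished and runDemo are hypothetical names, and the "database"
is just an in-memory set of unfinished transactions.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class XaRecoverySketch {

    /** Hypothetical minimal Xid; identity is the object itself. */
    static final class SimpleXid implements Xid {
        private final byte id;
        SimpleXid(int id) { this.id = (byte) id; }
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return new byte[] { id }; }
        @Override public byte[] getBranchQualifier() { return new byte[] { 0 }; }
    }

    /** Hypothetical in-memory stand-in for an XA-enabled resource (not Jaybird code). */
    static final class FakeXaResource implements XAResource {
        private final Set<Xid> inLimbo = new LinkedHashSet<>();

        /** Simulates a transaction left unfinished by a server crash. */
        void leaveInLimbo(Xid xid) { inLimbo.add(xid); }

        @Override public Xid[] recover(int flag) { return inLimbo.toArray(new Xid[0]); }
        @Override public void rollback(Xid xid) throws XAException {
            if (!inLimbo.remove(xid)) throw new XAException(XAException.XAER_NOTA);
        }
        @Override public void commit(Xid xid, boolean onePhase) throws XAException {
            if (!inLimbo.remove(xid)) throw new XAException(XAException.XAER_NOTA);
        }
        @Override public void start(Xid xid, int flags) { }
        @Override public void end(Xid xid, int flags) { }
        @Override public int prepare(Xid xid) { return XA_OK; }
        @Override public void forget(Xid xid) { inLimbo.remove(xid); }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /** What a coordinator does after a crash: ask for unfinished Xids, roll each back. */
    static int rollBackUnfinished(XAResource resource) throws XAException {
        Xid[] pending = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        for (Xid xid : pending) {
            resource.rollback(xid); // addressed by Xid only, no live connection involved
        }
        return pending.length;
    }

    /** Returns {number rolled back, number still pending afterwards}. */
    static int[] runDemo() {
        try {
            FakeXaResource resource = new FakeXaResource();
            resource.leaveInLimbo(new SimpleXid(1));
            int rolledBack = rollBackUnfinished(resource);
            int stillPending = resource.recover(XAResource.TMNOFLAGS).length;
            return new int[] { rolledBack, stillPending };
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println("rolled back " + result[0] + ", still pending " + result[1]);
    }
}
```

The point of the sketch is that the rollback is addressed by Xid alone, so
the resource has to open its own temporary connection to do the work.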

Now the interesting part happens. Here's the code:

    try {
        if (tempLocalTx != null && tempLocalTx.inTransaction())
            tempLocalTx.commit();
    } finally {
        if (tempMc != null) tempMc.destroy();
    }

The error you reported happens in tempMc.destroy(). This managed
connection was used to end the in-limbo transaction. To do that it also
started a new transaction (it queries the database to find in-limbo
transactions), which it should have closed before the managed connection
was destroyed. Since that did not happen, it looks like something went
wrong earlier.
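One possible reading, and only an assumption on my side, is that the
commit in the try block failed first. In Java an exception thrown from a
finally block replaces the one thrown from the try block, so only the
destroy() error would show up in the log. A minimal sketch of that
behaviour, where FakeManagedConnection and runCleanup are hypothetical
stand-ins and not Jaybird classes:

```java
public class FinallyMaskingSketch {

    /** Hypothetical stand-in for a managed connection with a helper transaction. */
    static final class FakeManagedConnection {
        private final boolean failCommit;
        private boolean transactionActive = true;

        FakeManagedConnection(boolean failCommit) { this.failCommit = failCommit; }

        void commitHelperTransaction() {
            if (failCommit) {
                // Simulated earlier failure: the helper transaction stays open.
                throw new IllegalStateException("commit failed: connection lost");
            }
            transactionActive = false;
        }

        void destroy() {
            if (transactionActive) {
                throw new IllegalStateException("cannot destroy: transaction still active");
            }
        }
    }

    /** Same try/finally shape as the snippet above; returns the message that surfaces. */
    static String runCleanup(FakeManagedConnection mc) {
        try {
            try {
                mc.commitHelperTransaction();
            } finally {
                mc.destroy(); // if this throws, the commit failure is lost
            }
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runCleanup(new FakeManagedConnection(false)));
        System.out.println(runCleanup(new FakeManagedConnection(true)));
    }
}
```

With failCommit set to true, the message that surfaces is the one from
destroy(), and the original commit failure is not visible at all.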

Now the question is: what? You can check whether the same exceptions
happen when you manually kill FB during low load. If this turns out to be
reproducible, we can track it down.

In general, what you see is not too dangerous. The worst thing that can
happen is that transactions are left in limbo (which will inhibit garbage
collection), but you can fix that with gfix. The second stack trace
shows only that JBoss was unable to complete the in-limbo transactions
automatically.

And, finally, if you don't really need XA transactions, switch to local
transactions. They are also more lightweight for Firebird (and for JBoss).

Roman