Subject Re: [firebird-support] Problem with FB database that freezes
Author

Hi Mark,


Thanks for your insightful response. FWIW, I would like to mention that, in the same server, we have another database (same size ~7 GB) no one connects to, it's a restore of the production database from January this year. This database works perfectly even when the production database is down. We try only a few test connections though.


Below is some of the requested information, at a time when the production database performance is normal.


I beg you to please read it until the end. You might have nailed something.


Thanks again. Hope to hear from you soon,


-Eduardo


Firebird.conf:

---------------------

DefaultDbCachePages = 1024

#FileSystemCacheThreshold = 65536 (commented out)

#FileSystemCacheSize = 0 (commented out)


Server environment:

--------------------------

CPU utilization: 11%

Memory utilization: 11 GB (out of 32)


Note.- Even when the DB performance is down, this values are in the same range or even lower. No swapping.


gstat output (normal performance):

---------------------------------------------------------

Database header page information:
 Flags   0
 Checksum  12345
 Generation  19572161
 Page size  16384
 ODS version  11.2
 Oldest transaction 18709808
 Oldest active  18953295
 Oldest snapshot  18851591
 Next transaction 19520857
 Bumped transaction 1
 Sequence number  0
 Next attachment ID 50438
 Implementation ID 26
 Shadow count  0
 Page buffers  3000
 Next header page 0
 Database dialect 1
 Creation date  Jul 7, 2015 7:00:57
 Attributes  no reserve

    Variable header data:
 Database backup GUID: {BF8D26E0-970E-431A-7FAD-E2D9BDB2E4DA}
 Sweep interval:  0
 *END*


Note.- We seep the database manually each night.


fb_lock_print output (normal performance):

----------------------------------------------------------------

LOCK_HEADER BLOCK
 Version: 145, Active owner:      0, Length: 28311552, Used: 27588104
 Flags: 0x0001
 Enqs: 69364533, Converts: 192066, Rejects:  36029, Blocks: 282250
 Deadlock scans:      7, Deadlocks:      0, Scan interval:  10
 Acquires: 77720068, Acquire blocks: 2159883, Spin count:   0
 Mutex wait: 2.8%
 Hash slots: 1009, Hash lengths (min/avg/max):   51/  66/  81
 Remove node:      0, Insert queue:      0, Insert prior:      0
 Owners (145): forward: 441288, backward:  98120
 Free owners (11): forward: 24695928, backward: 23070064
 Free locks (2963): forward:  22024, backward: 27499760
 Free requests (42905): forward: 22145288, backward: 25253392
 Lock Ordering: Enabled



Firebird.log (IBMCASA is the server's host name)

------------------------------------------------------------------

The log is literally FULL of 10053 and 10054 error entries like the following:


IBMCASA Thu Jul 23 10:27:27 2015
 Unable to complete network request to host "IBMCASA".
 Error writing data to the connection.


IBMCASA Thu Jul 23 10:27:29 2015
 Unable to complete network request to host "IBMCASA".
 Error reading data from the connection.


IBMCASA Thu Jul 23 10:27:30 2015
 INET/inet_error: read errno = 10054


According to the log, this errors seems to be happening every second or every few seconds/minutes, since March 8 2014 and until today even as I'm writing this. Each day, this errors stop at 11:49 PM when the last users stop working on the client apps, then they'll start again every morning at 6:00 AM when the first client apps connect to the database.