Subject Re: [firebird-support] Problem with FB database that freezes
Author Thomas Steinmaurer
Hi Eduardo,

[snip]

> Firebird.conf:
>
> ---------------------
>
> DefaultDbCachePages = 1024
>
> #FileSystemCacheThreshold = 65536 (commented out)
>
> #FileSystemCacheSize = 0 (commented out)
>
>
> Server environment:
>
> --------------------------
>
> CPU utilization: 11%
>
> Memory utilization: 11 GB (out of 32)
>
>
> Note.- Even when the DB performance is down, this values are in the same
> range or even lower. No swapping.
>
>
> gstat output (normal performance):
>
> ---------------------------------------------------------
>
> Database header page information:
> Flags 0
> Checksum 12345
> Generation 19572161
> Page size 16384
> ODS version 11.2
> Oldest transaction 18709808
> Oldest active 18953295
> Oldest snapshot 18851591
> Next transaction 19520857
> Bumped transaction 1
> Sequence number 0
> Next attachment ID 50438
> Implementation ID 26
> Shadow count 0
> Page buffers 3000

Possibly a bit high for Classic, but might be ok. You might increase
that to e.g. 2048 and spend more RAM on the temporary storage module for
ordering/group by results etc.


> Next header page 0
> Database dialect 1
> Creation date Jul 7, 2015 7:00:57
> Attributes no reserve
>
> Variable header data:
> Database backup GUID: {BF8D26E0-970E-431A-7FAD-E2D9BDB2E4DA}
> Sweep interval: 0
> *END*

You have a long-running active transaction, cause the gap between Next
Transaction - Oldest Active is quite high. As Sean mentioned, use the
monitoring tables to identify the long-running process. As you are using
FB TraceManager, it is a simple mouse-click via the context-menu in the
parsed gstat output to locate the OAT in the monitoring tables.


> Note.- We seep the database manually each night.

Do transaction counters move on with that?



> fb_lock_print output (normal performance):
>
> ----------------------------------------------------------------
>
> LOCK_HEADER BLOCK
> Version: 145, Active owner: 0, Length: 28311552, Used: 27588104
> Flags: 0x0001
> Enqs: 69364533, Converts: 192066, Rejects: 36029, Blocks: 282250
> Deadlock scans: 7, Deadlocks: 0, Scan interval: 10
> Acquires: 77720068, Acquire blocks: 2159883, Spin count: 0
> Mutex wait: 2.8%
> Hash slots: 1009, Hash lengths (min/avg/max): 51/ 66/ 81
> Remove node: 0, Insert queue: 0, Insert prior: 0
> Owners (145): forward: 441288, backward: 98120
> Free owners (11): forward: 24695928, backward: 23070064
> Free locks (2963): forward: 22024, backward: 27499760
> Free requests (42905): forward: 22145288, backward: 25253392
> Lock Ordering: Enabled

You have very high hash lengths values (51-81) running with the default
hash slots value of 1009. Increase it to at least 10009 or even higher
in firebird.conf. It is recommended that the value is a prime number.
AVG hash lengths should not be larger than ~ 20.


--
With regards,
Thomas Steinmaurer
http://www.upscene.com/

Professional Tools and Services for Firebird
FB TraceManager, IB LogManager, Database Health Check, Tuning etc.


> Firebird.log (IBMCASA is the server's host name)
>
> ------------------------------------------------------------------
>
> The log is literally FULL of 10053 and 10054 error entries like the
> following:
>
>
> IBMCASA Thu Jul 23 10:27:27 2015
> Unable to complete network request to host "IBMCASA".
> Error writing data to the connection.
>
>
> IBMCASA Thu Jul 23 10:27:29 2015
> Unable to complete network request to host "IBMCASA".
> Error reading data from the connection.
>
>
> IBMCASA Thu Jul 23 10:27:30 2015
> INET/inet_error: read errno = 10054
>
>
> According to the log, this errors seems to be happening every second or
> every few seconds/minutes, since March 8 2014 and until today even as
> I'm writing this. Each day, this errors stop at 11:49 PM when the last
> users stop working on the client apps, then they'll start again every
> morning at 6:00 AM when the first client apps connect to the database.
>
>
>
>