Subject Performance lost with lots of connections
Author selensky
Hello all,

This post is about using Firebird in enterprise environment, but
since there isn't much activity in the firebird-enterprise group
(and it is flooded with spam) I am posting this here.

It will take a lot of time to explain everything, so I will begin
with the short version. The really *long* version of my post is at
the end.

I am using Firebird 1.5.4 classic, on 4 Intel Xeon double core
hypertreaded CPUs (that makes it a total of 16 processors) with
3.25GB RAM using Windows 2003 enterprise. The application is written
in Delphi 7, using IBExpress.

The database is just about 800 megabytes.

It all works OK when there are about 10 users connected.

When I get about 40-50 connections at the same time there is a
sudden, major drop in performance.

Typical queries, that usualy take seconds, suddenly take 5 minutes
to execute.

The most noticable thing is that when I run Task manager I noticed
that during such slowdowns the CPUs are doing NOTHING. The systems
stays at 99% idle for say 10 seconds, then suddenly all the 40-50
fbserver.exes instances start working at 1-2% cpu load, then it
stops again. Some new attachments sometimes get executed almost
immediately, while the "old" ones are working really slow. For
example I noticed one fbserver.exe that has started about one hour
ago (i.e. the connection was started an hour ago) got as much as 2
minutes of total CPU time, and I am certain that it was issuing
query after query (i.e. isn't pausing its work ever!).

There is plenty of free RAM available (2GB ram used, 1 GB ram
available) at the time of the perfomance drop.

One other thing that bothers me: no matter what settings I play
with, I can never get the total CPU workload to exceed 40%. This I
think means that I have lots of free CPU time.

Has anyone experiences such behaviour?


--------------------------------------------------------------
Now the long story with lots of details, if they can be of any help

The database gets used in two major ways: reporting and replication.

About 20 to 30 users are using the above mentioned central database
for reporting.

About 150 databases spread around the country are getting and
sending partial replicas of the central database at random intervals
from 0 to 2 hours. We use our own replication technology.

The whole thing is a typical warehouse/ POS software and the 150
databases are shops. The data is mainly sales/payments/clients etc.

Everything used to work very well up until the database got to about
300-400MB. At the time it was even runing on a smaller server (4
CPUs *only* (2 cores, hypertreading), 2GB ram)

Now more and more often we are getting severe slowdowns. After a
certain amount of connections, say around 40, from which about 10
doing ocasional reporting and 30 doing replication, the whole server
suddenly becomes irresponsive.

I have played a lot with every single setting that I could have
imagined, but I couldn't find out what does it nor how to fix it.

At one time I thought the old server is simply not powerful enough
and requested a better server. The company spent thousands of
dollars (dont' ask :( ) and got me this 16 CPU monster. To my
deepest surprise it didn't help at all! I realised that something is
wrong with either the way I use firebird or with firebird itself
(oh, I hope not!).

The replication rarely does anything more complicated than INSERTs
or Updates.

I noticed that sometimes a replication might do just 10 inserts in 5
tables and it takes 20 minutes to do so. I found some badly written
triggers that select thousands of rows to check simple things. After
I removed them the performance went up big time.

STILL, whenever someone runs a complicated report, even though I
have many CPUs, after a certain number of connection the performance
*suddenly* drops.

I have a feeling that sometimes there are queries that "block" all
the other connections.

I would be deeply grateful if anyone could help.

Sorry for the long post.