Subject Re: [firebird-support] Firebird is slower on multicpu Windows box than single cpu
Author Geoff Worboys
Hi David,

> This behavior is consistent with the "ping-pong" effect that
> rapid fire context switching can have in Hyperthreaded Windows
> systems. The CPU becomes tied up doing nothing but context
> switching between threads.

You are mixing up the old SMP thrashing issue with a different
problem that seems to be highlighted by new HT capable machines.

Yes, on real SMP (HT is not real SMP) there is apparently a
potential for FB to spend too much time context switching, a
problem that can be worked around using CPU affinity (at the
cost of not taking advantage of multiple processors).

HT capable machines have a different problem that I have only
ever seen when transferring data between databases over a
TCP/IP connection (even localhost); it does not occur with
Windows local connections. The symptom is that the transfer
freezes for periods of time, waking up occasionally to do a
bit more work and then freezing again. You can wake it up
deliberately by doing anything on a separate TCP/IP database
connection to the same server.

(Note that it really does freeze: the CPUs are idle and are
certainly not thrashing.)

I have not seen the problem when just pumping data directly
from another source, only when transferring between databases,
but that does not mean it cannot happen in other situations.
Note that it does not seem to occur with gbak.
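
For anyone who wants to check whether their own transfers show
this freeze-and-wake pattern, here is a minimal sketch of one way
to spot it from progress samples. It assumes you log (time in
seconds, rows transferred so far) pairs yourself during the
transfer; the function name and the sample data are hypothetical
and nothing here is part of Firebird.

```python
def find_stalls(samples, min_gap_seconds):
    """Find periods in which the row count did not advance.

    samples: list of (timestamp_seconds, rows_done) in time order.
    Returns (start, end) timestamp pairs for every period of at
    least min_gap_seconds with no progress.
    """
    stalls = []
    stall_start = None
    for (t_prev, rows_prev), (_, rows_cur) in zip(samples, samples[1:]):
        if rows_cur == rows_prev:
            # No progress between these two samples.
            if stall_start is None:
                stall_start = t_prev
        else:
            # Progress resumed; record the stall if it was long enough.
            if stall_start is not None and t_prev - stall_start >= min_gap_seconds:
                stalls.append((stall_start, t_prev))
            stall_start = None
    # A stall that is still in progress at the last sample.
    if stall_start is not None and samples[-1][0] - stall_start >= min_gap_seconds:
        stalls.append((stall_start, samples[-1][0]))
    return stalls


# Hypothetical log: the transfer sits at 100 rows from t=1 to t=10.
log = [(0, 0), (1, 100), (2, 100), (10, 100), (11, 250), (12, 400)]
print(find_stalls(log, min_gap_seconds=5))
```

Comparing the reported stalls against CPU usage over the same
period would help separate this freeze (idle CPUs) from the SMP
thrashing case (busy CPUs).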

In the version 2.0 alpha release notes Nickolay Samofatov
mentions finding a "rare race condition" (see under the
heading "Changes to synchronization logic"). It seems likely
that this is the problem we are seeing with HT capable
machines (that there is something about newer CPUs that raises
the problem from rare to almost predictable).

On single physical CPU machines the solution to the HT problem
is to disable HT - AFAIK this usually works. On dual physical
CPU machines (4 HT processors) the problem may be even more
severe if you disable HT.

With dual physical CPUs the symptoms can be alleviated by
enabling HT and spreading CPU affinity across at least the two
logical processors that belong to the same physical processor
OR (what seems to work better on my machine) across all four
logical processors. But note that this only seems to alleviate
the problem, it does not resolve it completely.
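
As a rough illustration of the affinity settings described above:
CPU affinity is normally expressed as a bitmask with one bit per
logical processor (in Firebird 1.5 this is the CpuAffinityMask
entry in firebird.conf). The sketch below just computes such
masks. The helper name is hypothetical, and which processor
numbers share a physical CPU is an assumption that depends on how
your system enumerates them.

```python
def affinity_mask(cpu_indices):
    """Return a bitmask with bit n set for each zero-based logical CPU n."""
    mask = 0
    for n in cpu_indices:
        if n < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << n
    return mask


# All four logical processors on a dual physical CPU box with HT enabled.
all_four = affinity_mask([0, 1, 2, 3])

# Two logical processors; that 0 and 1 sit on the same physical
# CPU is an assumption, check your own machine's numbering.
one_pair = affinity_mask([0, 1])

print(f"CpuAffinityMask = {all_four}")
print(f"CpuAffinityMask = {one_pair}")
```

The server has to be restarted for a changed mask to take effect.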

I have asked Nickolay if it may be possible to back-port the
v2 fix to FB 1.5.2 because I consider this to be quite serious.
However, apparently there are issues with such a backport, so I
do not know if it will happen. If enough other people can
confirm that the HT issue is causing them real problems then
perhaps we can have it treated with a higher priority.

--
Geoff Worboys
Telesis Computing