Subject Re: [IB-Architect] IB/FB Lock manager failures
Author Jim Starkey
At 06:22 PM 9/15/2002 +0400, Dmitry Yemanov wrote:
>
>We all know that the SuperServer engine is unacceptable for most serious
>tasks, because:
>1) It doesn't execute requests in parallel
>2) It doesn't support SMP (all common computers will be SMP soon with
>HyperThreading)
>3) Server failure terminates all user queries
>
>While doing the lock manager rework to solve the "OBJECT xx IS IN USE" issue,
>I discovered that it is much easier to move the lock manager data to shared
>memory than to support the current CS lock architecture, which tends to slow
>down as the number of connections grows.
>
>I.e. I allocate a shared memory pool protected with some kind of
>mutex/spinlock, move the lock manager data there (as it is represented in
>the SS architecture), then move the buffer cache there too. This should
>solve all current CS architecture problems. We may drop the SS architecture
>after that, or rework it to work like the Oracle Multi-Threaded Server does.
>Suggestions?
>
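
For concreteness, here is roughly what I take you to be proposing --
one region, one cross-process mutex in front of everything in it.
The names and the POSIX calls below are mine for illustration, not
anything in the tree:

    /* Minimal sketch: a shared region holding lock manager data,
     * guarded by a single process-shared mutex. */
    #include <fcntl.h>
    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define LOCK_REGION_SIZE (1024 * 1024)

    typedef struct lock_header {
        pthread_mutex_t mutex;       /* every access pays for this */
        long            free_offset; /* crude bump allocator */
        /* lock blocks, request queues, etc. would follow,
         * addressed by offset rather than by pointer */
    } LOCK_HEADER;

    static LOCK_HEADER* map_lock_region(void)
    {
        int fd = shm_open("/fb_lock_mgr", O_CREAT | O_RDWR, 0666);
        if (fd < 0)
            return NULL;
        ftruncate(fd, LOCK_REGION_SIZE);
        LOCK_HEADER* hdr = mmap(NULL, LOCK_REGION_SIZE,
                                PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        close(fd);
        if (hdr == MAP_FAILED)
            return NULL;

        /* the first process to arrive initializes; real code
         * needs a handshake to make that race-free */
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&hdr->mutex, &attr);
        hdr->free_offset = sizeof(LOCK_HEADER);
        return hdr;
    }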

There are four models under discussion here:

1. Classic -- independent processes sharing a common lock manager.

2. SuperServer -- single, coarse-grained, multi-threaded process
containing an internal lock manager. Current implementation has a
single mutex protecting virtually all internal data structures.

3. Fine-grained, multi-threaded single process server. This is the
goal of Firebird 2.

4. Your proposed multi-process server with a shared lock
manager and page cache.

There was a fifth model, essentially Classic with a network
accessible page/lock server, that I never (for very good
reasons) got around to finishing.

There are a variety of architectural problems with multi-process
architectures:

1. Historically, System V shared memory is allocated out of
non-paged physical memory, making it a scarce and expensive
resource. Memory is now essentially free, and not all Unix
developers are as dim as AT&T, so this may no longer be a
factor.

2. Inter-process mutexes are computationally very, very expensive.
Intra-process thread mutexes are much cheaper than inter-process
mutexes, and multi-state (shared/exclusive) locks are much cheaper
than even intra-process mutexes for the most common shared/shared
interactions (the first sketch after this list illustrates the
difference).

3. Shared memory cannot be reliably mapped into a fixed virtual
location, so any data structures in shared memory must use
variations on self-relative pointers, making the code necessary
to handle them messy and slow (the second sketch after this list
shows the scheme). All implementations of the Interbase/Firebird
lock manager share these characteristics.

4. Any crash of a server process can corrupt the shared data structures,
crashing all processes. While this is no worse than the threaded
server case, it is no better, either.
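
To put point 2 in code (pthreads assumed, names illustrative): an
ordinary mutex serializes even readers, while a shared/exclusive
lock lets the common shared/shared case proceed without waiting.
The inter-process variety -- the kind initialized with
PTHREAD_PROCESS_SHARED in the sketch above -- is the most
expensive of the three.

    #include <pthread.h>

    static pthread_mutex_t  plain_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_rwlock_t sync_lock   = PTHREAD_RWLOCK_INITIALIZER;

    void lookup_with_mutex(void)
    {
        pthread_mutex_lock(&plain_mutex);   /* readers queue up */
        /* ... read-only walk of an internal structure ... */
        pthread_mutex_unlock(&plain_mutex);
    }

    void lookup_with_shared_lock(void)
    {
        pthread_rwlock_rdlock(&sync_lock);  /* readers run together */
        /* ... the same read-only walk ... */
        pthread_rwlock_unlock(&sync_lock);
    }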
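
And point 3: since the region may map at a different address in
each process, nothing in it can hold an ordinary pointer. The usual
variation -- and roughly the scheme the existing lock manager uses --
is to store offsets and convert on every reference. Names below are
illustrative:

    typedef long SRQ_PTR;    /* offset from the base of the region */

    #define REL_PTR(base, ptr)  ((SRQ_PTR)((char*)(ptr) - (char*)(base)))
    #define ABS_PTR(base, off)  ((void*)((char*)(base) + (off)))

    typedef struct srq {     /* a queue node linked by offsets */
        SRQ_PTR srq_forward;
        SRQ_PTR srq_backward;
    } SRQ;

    /* every traversal pays for the conversion -- this is the
     * "messy and slow" part */
    #define SRQ_NEXT(base, node) \
        ((SRQ*)ABS_PTR((base), (node)->srq_forward))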

Moving the lock manager and page cache into shared memory is insufficient.
The Firebird cache manager also tracks page write precedence to enforce
careful write. This necessarily requires that the cache manager
data structures also be moved into shared memory, modified to handle
self-relative pointers, and protected with mutexes. This means, in
turn, that every reference through a BDB (buffer descriptor) will
require an expensive multi-process mutex. Much of Interbase/Firebird
performance is based on a dirt cheap page cache hit cycle, which
would be significantly degraded by the need to interlock the data
structures in shared memory across multiple processes.
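
A sketch of what the hit path would look like with the cache in
shared memory (the BDB and BCB fields here are hypothetical, pared
down to show where the cross-process interlock lands):

    #include <pthread.h>

    typedef unsigned long ULONG;

    typedef struct bdb {
        ULONG bdb_page;        /* page number this buffer holds */
        long  bdb_next;        /* offset of next BDB in hash chain */
        int   bdb_use_count;
        /* precedence links for careful write would also live
         * here, again as offsets */
    } BDB;

    typedef struct bcb {
        pthread_mutex_t bcb_mutex;     /* process-shared, expensive */
        long            bcb_hash[512]; /* offsets into the region */
    } BCB;

    static BDB* fetch_page(char* region, BCB* bcb, ULONG page)
    {
        pthread_mutex_lock(&bcb->bcb_mutex);   /* cross-process */
        for (long off = bcb->bcb_hash[page % 512]; off;) {
            BDB* bdb = (BDB*) (region + off);  /* offset -> address */
            if (bdb->bdb_page == page) {
                bdb->bdb_use_count++;          /* pin before release */
                pthread_mutex_unlock(&bcb->bcb_mutex);
                return bdb;                    /* cache hit */
            }
            off = bdb->bdb_next;
        }
        pthread_mutex_unlock(&bcb->bcb_mutex);
        return NULL;                           /* miss: caller reads page */
    }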

Your question really isn't about a multi-process server vs.
SuperServer, but a multi-process server vs. a future fine-grained,
multi-threaded, single process server. In my mind, there are
two metrics to determine the preferred answer: relative
cost/risk of development and relative performance.

It is hard to make an accurate prediction of the relative costs.
Interlocking the internal data structures for high performance,
fine-grained multi-threading is very, very difficult, particularly
under extremely heavy load (Netfrastructure uses this architecture,
so I have some experience with what I say). On the other hand,
designing and implementing a partially internal, partially shared
cache manager is no picnic.

However, it is easy to compare the relative potential performances
of the two architectures: A single process, fine grained, multi-
threaded server will blow the doors off a multi-process, shared
memory server due to the huge difference in data structure
synchronization cost.

I'm not going to argue that a clever programmer can't find a way
to adapt the Classic architecture to a shared page cache design.
But I will argue that that design is inferior in performance to a
fine granularity, multi-threaded server and, more to the point, the
effort expended will add nothing to the development of a fine
granularity, multi-threaded server.

Jim Starkey