Subject: Re: [IB-Architect] IB/FB Lock manager failures
Author: Steven Shaw
From: "Jim Starkey" <jas@...>
> At 06:22 PM 9/15/2002 +0400, Dmitry Yemanov wrote:
> >
> >We all know that the SuperServer engine is unacceptable for doing most
> >serious tasks, because:
> >1) It doesn't execute requests in parallel
> >2) It doesn't support SMP (all common computers will be SMP soon with
> >HyperThreading)
> >3) Server failure terminates all user queries
> >
> >While doing lock manager rework to solve the "OBJECT xx IS IN USE"
> >issue, I discovered that it is much easier to move the lock manager
> >data to shared memory than to support the current CS lock
> >architecture, which tends to slow down as the number of connections
> >grows.
> >
> >I.e. I allocate a shared memory pool protected with some kind of
> >mutex/spinlock, move the lock manager data there (as it is represented
> >in the SS architecture), then move the buffer cache there too. This
> >should solve all the current CS architecture problems. We may drop the
> >SS architecture after that, or rework it to work the way Oracle
> >Multi-Threaded Server works.

Dmitry, sorry if I missed an earlier post, but could you explain the
highlights of how the Oracle multi-threaded server works?
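
For what it's worth, here is roughly how I picture the shared pool you
describe: a POSIX shared memory region with a process-shared mutex
inside it. This is only a sketch of my understanding, not engine code;
the names (LOCK_POOL_NAME, struct lock_pool) and the size are made up,
and the race-free "first process initialises" test is elided.

    /* Sketch: attach to (or create) a shared pool guarded by a
       process-shared pthread mutex.  Names and sizes are hypothetical. */
    #include <fcntl.h>
    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define LOCK_POOL_NAME "/fb_lock_pool"    /* hypothetical name */
    #define LOCK_POOL_SIZE (1024 * 1024)

    struct lock_pool {
        pthread_mutex_t mutex;    /* must be PTHREAD_PROCESS_SHARED */
        char data[1];             /* lock tables, cache headers, ... */
    };

    static struct lock_pool *attach_pool(void)
    {
        int fd = shm_open(LOCK_POOL_NAME, O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return 0;
        ftruncate(fd, LOCK_POOL_SIZE);

        struct lock_pool *pool = mmap(0, LOCK_POOL_SIZE,
                                      PROT_READ | PROT_WRITE,
                                      MAP_SHARED, fd, 0);
        close(fd);
        if (pool == MAP_FAILED)
            return 0;

        /* Only the first process should do this; a real implementation
           needs a race-free "am I first?" test, elided here. */
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&pool->mutex, &attr);
        pthread_mutexattr_destroy(&attr);

        return pool;
    }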

> >Suggestions ?
> >
>
> There are four models under discussion here:
>
> 1. Classic -- independent processes sharing a common lock manager.
>
> 2. SuperServer -- single, coarse-grained, multi-threaded process
> containing an internal lock manager. Current implementation has a
> single mutex protecting virtually all internal data structures.
>
> 3. Fine-grained, multi-threaded single process server. This is the
> goal of Firebird 2.
>
> 4. Your proposed multi-process server with a shared lock
> manager and page cache.
>
> There was a fifth model, essentially classic with a network
> accessible page/lock server, that I never (for very good
> reasons) got around to finishing.

Jim, do you think that a network accessible page/lock server is a good
idea? From "good reasons" alone I can't tell whether you mean it was a
bad idea or whether priorities simply lay elsewhere.

>
> There are a variety of architectural problems with multi-process
> architectures:
>
> 1. Historically, System V shared memory is allocated out of
> non-paged physical memory, making it a scarce and expensive
> resource. Memory is now essentially free, not all Unix
> developers are as dim as AT&T, so this may no longer be a
> factor.

Are you saying that AT&T developers were dim to require that shared memory
pages remain in physical memory? Should they be allowed to swap out? Does it
matter anyway since, as you point out, memory is essentially free?
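
Just so we are talking about the same calls, the System V interface I
mean is shmget/shmat; this sketch also shows SHM_LOCK, which on systems
that support it is the explicit way to pin a segment in physical memory
(early implementations effectively pinned the segment whether you
wanted that or not).

    /* System V shared memory in a nutshell (sketch only). */
    #include <stddef.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    void *sysv_attach(key_t key, size_t size, int pin)
    {
        int id = shmget(key, size, IPC_CREAT | 0600);
        if (id < 0)
            return (void *) -1;

        if (pin)
            shmctl(id, SHM_LOCK, NULL);   /* keep the segment out of swap */

        return shmat(id, NULL, 0);        /* kernel picks the address */
    }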

>
> 2. Inter-process mutexes are computationally very, very expensive.
> Intra-process thread mutexes are much cheaper than inter-process
> mutexes, and multi-state (shared/exclusive) locks are much cheaper
> than even intra-process mutexes for the most common shared/shared
> interactions.

Jim, are you talking about semaphores? Aren't there other options for
inter-process mutual exclusion, such as spin-locks?

I know I need to go back to school, but could you elaborate (just a
little) on why intra-process locks are quicker than inter-process locks?
Cheers.
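
To make the spin-lock question concrete, this is the sort of thing I
have in mind: a minimal test-and-set spin lock (written here with C11
atomics, purely as a sketch). Placed in shared memory it excludes
across processes just as well as across threads; the catch is that a
spinning waiter burns CPU, and blocking instead means a trip into the
kernel, which I take to be the expense Jim is describing.

    /* Minimal test-and-set spin lock (sketch).  Works across processes
       if the spinlock_t lives in shared memory. */
    #include <stdatomic.h>

    typedef struct {
        atomic_flag locked;
    } spinlock_t;

    #define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

    static void spin_lock(spinlock_t *s)
    {
        /* Loop until we are the one who flips the flag. */
        while (atomic_flag_test_and_set_explicit(&s->locked,
                                                 memory_order_acquire))
            ;   /* busy-wait: cheap under low contention, brutal under high */
    }

    static void spin_unlock(spinlock_t *s)
    {
        atomic_flag_clear_explicit(&s->locked, memory_order_release);
    }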


> 3. Shared memory cannot be reliably mapped into a fixed virtual
> location, so any data structures in shared memory must use
> variations on self-relative pointers, making the code necessary
> to handle them messy and slow. All implementations of the
> Interbase/Firebird lock manager share these characteristics.

I have seen shared memory mapped to a fixed virtual location work across
AIX, HP-UX, Solaris, Linux and Windows NT. You have to be careful about
the order in which external libraries are initialised in each process
that must attach to the shared memory at the same virtual address.
Working with "direct pointers" in memory is pretty nice. I've seen it
done with offsets, too; that's pretty ugly, because all the code needs
to have the magical global knowledge of the base address.

Maybe I don't understand what you mean by "reliably".
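
For the archives, the offset trick looks something like this. A
self-relative "pointer" stores the distance from its own address to the
target rather than an absolute address, so the structure stays valid no
matter where each process happens to map the segment (the base-address
flavour I mentioned stores offsets from the start of the segment
instead, which is where the global base knowledge creeps in). Just a
sketch, with names of my own invention:

    #include <stddef.h>
    #include <stdint.h>

    /* Self-relative pointer: the stored value is the byte distance
       from the field itself to the target; 0 stands in for NULL. */
    typedef intptr_t srptr_t;

    static void *sr_get(srptr_t *p)
    {
        return *p ? (char *) p + *p : NULL;
    }

    static void sr_set(srptr_t *p, void *target)
    {
        *p = target ? (char *) target - (char *) p : 0;
    }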


> 4. Any crash of a server process can corrupt the shared data structures,
> crashing all processes. While this is no worse than the threaded
> server cases, it is no better, either.

When using unsafe languages like C and C++, the threaded server case is
actually worse, in that it allows one thread to corrupt not only shared
data structures but also the "private" data of other threads.

> [snipped a bit]
> Your question really isn't about a multi-process server vs.
> SuperServer, but a multi-process server vs. a future fine-grained,
> multi-threaded, single-process server. In my mind, there are
> two metrics to determine the preferred answer: relative
> cost/risk of development and relative performance.

By fine-grained, I guess you mean fine-grained locking of data
structures? Something like the sketch below is what I imagine.
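
A toy illustration of what I picture: instead of SuperServer's single
mutex over virtually everything, each bucket of (say) a page-cache hash
table gets its own lock, so threads touching different buckets never
contend. This is purely my own sketch, with no relation to the real
engine code; assume each bucket's mutex has been pthread_mutex_init()ed
at startup.

    #include <pthread.h>

    #define N_BUCKETS 256

    struct page {
        unsigned page_no;
        struct page *next;
        /* ... page contents ... */
    };

    struct bucket {
        pthread_mutex_t mutex;    /* one lock per bucket, not per cache */
        struct page *chain;
    };

    static struct bucket cache[N_BUCKETS];

    struct page *find_page(unsigned page_no)
    {
        struct bucket *b = &cache[page_no % N_BUCKETS];
        struct page *p;

        pthread_mutex_lock(&b->mutex);   /* blocks only lookups that
                                            hash to this same bucket */
        for (p = b->chain; p && p->page_no != page_no; p = p->next)
            ;
        /* real code would pin the page before dropping the lock */
        pthread_mutex_unlock(&b->mutex);
        return p;
    }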

> It is difficult to make an accurate prediction of the relative costs.
> Interlocking the internal data structures for high-performance,
> fine-grained multi-threading is very, very difficult, particularly
> under extremely heavy load (Netfrastructure uses this architecture, so
> I have some experience with what I say). On the other hand, designing
> and implementing a partially internal, partially shared cache
> manager is no picnic.

Is the Netfrastructure architecture single-process/multi-threaded?
Is this the state-of-the-art architecture for a DBMS?

> [snipped a bit more]

Cheers, Steve.