Subject: Processes, Threads, and Synchronization
Author: Jim Starkey
At 12:34 AM 9/18/02 +1000, Steven Shaw wrote:
>Jim, do you think that a network accessible page/lock server is a good
>idea? By "good reasons" I can't tell whether you mean it was a bad idea or
>if priorities lay elsewhere.

It was probably an OK idea for a secure LAN (this was all pre-Internet).
The thought of throwing around raw pages on networks separated only
by a few microns of silicon from 14 million malicious hackers causes
the fur to go up on the back of my neck.

The reason it was attractive is we were starved for CPU cycles
and wanted to take advantage of the compute power sitting around
the LAN. But there is now about as much compute in my notebook
as Sun's installed base in 1985.

>Are you saying that AT&T developers were dim to require that shared memory
>pages remain in physical memory? Should they be allowed to swap out? Does it
>matter anyway since, as you point out, memory is essentially free?

Memory is cheap, but address space is precious. If software were
as wasteful with physical memory as it is with virtual memory,
nothing would get done. Non-paged physical memory must be
managed as a scarce resource.

>I know I need to go back to school but could you elucidate (just a little)
>on why intra-process locks are quicker than inter-process locks. Cheers.

First, I think we agree that any excursion into an OS kernel is
expensive -- change in protection ring, virtual address space,
flushing or invalidating the cache, flushing the instruction
pipeline, validating and probing argument addresses, etc.

Inter-process synchronization requires kernel intervention. A
kernel primitive is better than a process-level wait/wakeup
mechanism as it gives the scheduler better information, eliminating
unnecessary process wakeups and context switches.

Intra-process (inter-thread) synchronization has shared memory
available to it permitting synchronization mechanisms that can
handle a subset of contention modes without resorting to a kernel
call. An example is read/read contention without a writer, which
happens to be the most common case when synchronizing data structures.
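A minimal sketch of that fast path in C11 atomics (the names here are hypothetical, not from any particular engine): readers bump a shared lock word with a compare-and-swap, so read/read contention never leaves user space. A production lock would park blocked threads on a kernel primitive rather than spin.

```c
/* Sketch: a shared-memory read/write lock where the common case
 * (read/read contention, no writer) is resolved entirely with atomic
 * operations -- no kernel call.  Illustrative only; a real lock would
 * block waiters in the kernel instead of spinning. */
#include <stdatomic.h>
#include <stdbool.h>

#define WRITER_BIT 0x40000000   /* set while a writer holds the lock */

typedef struct { atomic_int state; } rwlock_t;  /* reader count | writer bit */

static void rw_read_lock(rwlock_t *l) {
    for (;;) {
        int s = atomic_load(&l->state);
        /* Read/read contention: just bump the reader count atomically. */
        if (!(s & WRITER_BIT) &&
            atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return;
        /* Writer active: a real lock would block in the kernel here. */
    }
}

static void rw_read_unlock(rwlock_t *l) { atomic_fetch_sub(&l->state, 1); }

static bool rw_try_write_lock(rwlock_t *l) {
    int expected = 0;                       /* no readers, no writer */
    return atomic_compare_exchange_strong(&l->state, &expected, WRITER_BIT);
}

static void rw_write_unlock(rwlock_t *l) {
    atomic_fetch_and(&l->state, ~WRITER_BIT);
}
```

The point is the common case: acquiring read access on an uncontended or reader-held lock is a handful of instructions and no system call.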

One could use the same sort of shared-memory mechanism for
inter-process synchronization, but from an OS perspective it
violates the generally accepted standards for process isolation.

Linux pthreads operate without significant OS support other than
allowing processes to share a single virtual address space. You
might take a look at the pthread mutex implementation -- very
ugly, very inefficient. Although Linux context switches at twice
the rate of NT, NT outperforms Linux on a heavily threaded load
by at least 20% due to cheaper inter-thread synchronization
(critical sections vs. pthread mutexes).

>I have seen shared memory mapped to a fixed virtual location work across
>AIX, HP-UX, Solaris, Linux and Windows-NT. You have to be careful of the
>order in which external libraries are initialised in each process that must attach to
>the shared memory at the same virtual address. Working with "direct
>pointers" in memory is pretty nice. I've seen it done with offsets, too.
>It's pretty ugly because all code needs to have the magical global knowledge
>about the base-address.
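For what it's worth, the offset scheme described above can be sketched in a few lines of C (illustrative names): every intra-region reference is stored as a byte offset from the region base, so the region can map at a different address in each process -- at the cost of threading the base address through all the code.

```c
/* Sketch of "offsets instead of direct pointers" in a shared region:
 * references are byte offsets from the region base, so the mapping
 * address need not be identical in every process.  Names are
 * illustrative, not from any real implementation. */
#include <stddef.h>
#include <stdint.h>

typedef uint32_t shm_off_t;     /* offset from region base; 0 means null */

/* Offset -> pointer.  Needs the "magical global knowledge" of the
 * base address, which is exactly the ugliness complained about. */
static void *shm_ptr(void *base, shm_off_t off) {
    return off ? (char *)base + off : NULL;
}

/* Pointer inside the region -> offset. */
static shm_off_t shm_off(void *base, void *p) {
    return p ? (shm_off_t)((char *)p - (char *)base) : 0;
}

typedef struct {                /* a shared-memory list node using offsets */
    int value;
    shm_off_t next;             /* offset of the next node, not a raw pointer */
} node_t;
```

Mapping at a fixed virtual address avoids all of this, which is why direct pointers are so much nicer when you can get away with them.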


>When using unsafe languages like C and C++, the threaded server case is
>worse in that it allows one thread to corrupt not only shared data
>structures but also "private" data of other threads.

C and C++ are no less safe than any other language. Non-
synchronized data structures are just as dangerous in Java as
C or C++. That said, C and C++ allow choice of synchronization
mechanism, allowing a multi-state lock rather than a simple
mutex, while Java has mutexes hard wired.

>> [snipped a bit]
>> Your question really isn't about a multi-process server vs.
>> SuperServer, but a multi-process server vs. a future fine grain
>> multi-threaded, single process server. In my mind, there are
>> two metrics to determine the preferred answer: Relative
>> cost/risk of development and relative performance.
>By fine-grained, I guess you mean fine grained locking of data structures?

Exactly. The finer the granularity, the lower the likelihood of actual
contention, the greater the parallelism on SMP, and the lower the
rate of thread switches for synchronization everywhere.

>Is the Netfrastructure architecture single-process/multi-thread?
>Is this the state-of-the-art architecture for a dbms?

Netfrastructure is a single-process/multi-thread server. Unlike
most database engines, it also contains a Java virtual machine,
a search engine, and an elaborate page generation engine.

Netfrastructure has synchronization objects on about two
dozen classes for a total of about 30 types of things being
synchronized. The synchronization object resolves read/read
contention (sic!) with interlocked increment/decrement
instructions, resorting to pthread and critical-section
synchronization only for the thread wait queues. The goal, usually
achieved, is to satisfy a web request without a thread
or context switch.
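A rough sketch of that policy in C with pthreads (hypothetical names -- this is not Netfrastructure's actual code): shared access is granted with a single interlocked operation, and the pthread mutex and condition variable come into play only when a thread actually has to wait.

```c
/* Illustrative synchronization object: read/read contention is
 * resolved with interlocked increment/decrement on a state word;
 * the pthread mutex/condvar guard the wait queue and are touched
 * only on the slow path. */
#include <pthread.h>
#include <stdatomic.h>

typedef struct {
    atomic_int state;            /* >= 0: reader count; -1: exclusive */
    pthread_mutex_t wait_lock;   /* slow path: guards the wait queue */
    pthread_cond_t wait_queue;   /* where blocked threads sleep */
} sync_object_t;

#define SYNC_INIT { 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER }

static void sync_shared(sync_object_t *s) {
    for (;;) {
        int n = atomic_load(&s->state);
        if (n >= 0 && atomic_compare_exchange_weak(&s->state, &n, n + 1))
            return;              /* read/read: no kernel call, no switch */
        pthread_mutex_lock(&s->wait_lock);       /* writer active: wait */
        while (atomic_load(&s->state) < 0)
            pthread_cond_wait(&s->wait_queue, &s->wait_lock);
        pthread_mutex_unlock(&s->wait_lock);
    }
}

static void sync_release_shared(sync_object_t *s) {
    if (atomic_fetch_sub(&s->state, 1) == 1) {   /* last reader out */
        pthread_mutex_lock(&s->wait_lock);
        pthread_cond_broadcast(&s->wait_queue);  /* let a writer proceed */
        pthread_mutex_unlock(&s->wait_lock);
    }
}

static int sync_try_exclusive(sync_object_t *s) {
    int zero = 0;                /* succeeds only with no readers/writer */
    return atomic_compare_exchange_strong(&s->state, &zero, -1);
}

static void sync_release_exclusive(sync_object_t *s) {
    atomic_store(&s->state, 0);
    pthread_mutex_lock(&s->wait_lock);
    pthread_cond_broadcast(&s->wait_queue);      /* wake blocked readers */
    pthread_mutex_unlock(&s->wait_lock);
}
```

On the fast path a reader costs one interlocked operation in and one out, which is what makes satisfying a request without a thread or context switch plausible.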

Netfrastructure uses an "in-memory database" model, using the
disk as a persistent backfill. While it has a page cache,
records rather than page images reside in memory. And,
while it supports multi-generational records, it does so
in memory. The on-disk structure is not multi-generational.

Jim Starkey