Subject Re[2]: [IB-Architect] IB/FB Lock manager failures
Author Nickolay Samofatov
Hello Jim,

Tuesday, September 17, 2002, 6:09:34 PM, you wrote:

> And, incidentally, there are smarter ways to use large memory
> than a bigger disk cache.

But you will never be able to do this task faster than the OS,
because it can use hardware resources directly for that.
I think the buffer cache should not even try to do this.

> All excursions into the OS are expensive -- arguments probed,
> page tables shuffled, kernel synchronized with other processes,
> all sorts of nasty overhead. Yes, page delivery by tweaking
> the page table is faster than a long memory to memory move for
> the systems that support it. As an historical aside, Interbase's
> immediate precursor, Rdb/ELN, ran exactly that way on Cutler's
> Elan operating system. But I don't think Unix systems work that
> way, and even if they did, a system service call to fetch the
> page is still vastly more expensive than the 98% probability
> that the page is in the buffer pool.

Could you look at this topic from another point of view? All process
memory is mapped via hardware. If you add one more shared R/W area to a
process's memory, it will not differ in any way from the rest of its
memory. You can use process boundaries as an easy and natural way to
encapsulate process-private data.

Shared memory is essentially the same memory as local process memory.
It shouldn't be slower or faster than local process memory.

All OSes allow you to lay out the process address space as you wish,
including your module base addresses. You can even create a process from
scratch. Mapping memory at the same location in several processes should
not be a problem.
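
To make the point concrete, here is a minimal sketch, assuming a
Linux/POSIX-like system with mmap() and fork(). The names shared_area
and shared_roundtrip are hypothetical and have nothing to do with the
actual Interbase/Firebird sources. The child inherits the mapping at the
same address and writes to it with an ordinary store; the parent reads
it back with an ordinary load.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout of the shared area (illustration only). */
struct shared_area {
    int counter;
};

/* Maps one shared R/W area, lets a child process store 42 into it,
 * and returns what the parent then reads (or -1 on error). */
int shared_roundtrip(void)
{
    struct shared_area *area = mmap(NULL, sizeof *area,
                                    PROT_READ | PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;
    area->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(area, sizeof *area);
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual address, plain store, no system call. */
        area->counter = 42;
        _exit(0);
    }

    /* Parent: wait for the child, then read with a plain load. */
    int status;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid)
        seen = area->counter;
    munmap(area, sizeof *area);
    return seen;
}
```

Once the mapping exists, the only system calls are the ones that set up
and tear down the processes; the accesses themselves are plain memory
operations.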

A multi-state lock is usually constructed from 1-2 mutexes and a small
area of memory (to form a spinlock and save state).
Nothing prevents this memory from being shared. Timings should be the
same. When synchronizing on such locks, no system calls are
involved in most cases.
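
A rough sketch of what I mean, assuming C11 atomics, a Linux/POSIX-like
system, and lock-free atomics that behave the same in shared mappings
(true on common platforms, but an assumption). All names here (ms_lock,
ms_try_lock, ms_lock_demo) are made up for illustration; this is not the
Interbase lock manager. The state word is guarded by a spinlock, and the
only system call on the lock path is sched_yield() under contention.

```c
#define _DEFAULT_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical multi-state lock: spinlock guard plus a state word. */
struct ms_lock {
    atomic_flag guard; /* spinlock protecting 'state' */
    int state;         /* 0 = free, n > 0 = n shared holders, -1 = exclusive */
};

static void guard_enter(struct ms_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->guard, memory_order_acquire))
        sched_yield(); /* reached only when the guard is contended */
}

static void guard_leave(struct ms_lock *l)
{
    atomic_flag_clear_explicit(&l->guard, memory_order_release);
}

void ms_lock_init(struct ms_lock *l)
{
    atomic_flag_clear(&l->guard);
    l->state = 0;
}

/* Tries to take the lock in shared or exclusive mode; never blocks. */
bool ms_try_lock(struct ms_lock *l, bool exclusive)
{
    bool ok;
    guard_enter(l);
    if (exclusive) {
        ok = (l->state == 0);
        if (ok)
            l->state = -1;
    } else {
        ok = (l->state >= 0);
        if (ok)
            l->state++;
    }
    guard_leave(l);
    return ok;
}

void ms_unlock(struct ms_lock *l)
{
    guard_enter(l);
    if (l->state == -1)
        l->state = 0;
    else if (l->state > 0)
        l->state--;
    guard_leave(l);
}

/* Lock and counter live together in one shared mapping. */
struct shared_block {
    struct ms_lock lock;
    long counter;
};

/* Forks 'nprocs' children; each adds 'iters' to the shared counter
 * under the exclusive lock. Returns the final counter, or -1 on error. */
long ms_lock_demo(int nprocs, int iters)
{
    struct shared_block *b = mmap(NULL, sizeof *b, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return -1;
    ms_lock_init(&b->lock);
    b->counter = 0;

    int started = 0;
    for (int p = 0; p < nprocs; p++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int i = 0; i < iters; i++) {
                while (!ms_try_lock(&b->lock, true))
                    sched_yield();
                b->counter++;
                ms_unlock(&b->lock);
            }
            _exit(0);
        }
        started++;
    }
    while (started-- > 0)
        wait(NULL);

    long total = b->counter;
    munmap(b, sizeof *b);
    return total;
}
```

The same structure works unchanged in private memory between threads,
which is the point: the lock code does not care whether the memory it
sits in is shared between processes.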

> I don't think so. What makes a cluster a cluster is a distributed
> lock managed with automatic failover. Perhaps I've missed something,
> but VMS is the only platform I've heard of that supports a distributed
> lock manager. I understand IBM is sponsoring an open source DLM
> project for Linux modelled on the VMS lock mangler, but falls short.
> I don't know whether it could be used or not.

I don't think that the lock manager is that important. A rich IPC API
is what forms a cluster. Fail-over is relatively easy to do in this case.

> My point was that you can't just move a little bit into shared memory,
> you must move a great deal. And unless all processes are running
> trusted server code, you must allow for failover.

> The code base is already a piece of c***p with massive conditionalizations
> for classic and superserver. Adding another major variant within
> the same code base could be fatal. If you're going to proceed, do
> plan to fork the code.

I don't want to spawn another Interbase variant. I don't have the
resources for that. I just want to fix the current CS problems and
make it work faster without breaking things much.

I think that creating SuperServer in its current incarnation was a
mistake, and it will not be reworked into an MT-safe variant any time
soon. But we need (and have) a stable, scalable version now. Why not
make it better? An MT variant is also good to have (a combined
multi-thread/multi-process approach offers the best scalability),
but it is not so important if there is a CS version on Windows.

Moreover, if we fix the CS version and drop all the SS stuff from the
codebase, we can make an MT-safe version very quickly. Just allow
several threads to run in the process and make all global data
thread-local. Then we can do some performance tuning, allowing threads
to share common structures with fine-grained synchronization.
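
As a small illustration of the "make global data thread-local" step,
here is a sketch assuming C11 _Thread_local and POSIX threads. The
variable current_attachment and the function names are hypothetical
stand-ins, not identifiers from the real code base. Code that used to
read a process-wide global keeps working unchanged, but every thread now
sees its own copy.

```c
#include <pthread.h>
#include <stdint.h>

/* Formerly a plain process-wide global (hypothetical name). */
static _Thread_local int current_attachment = 0;

/* Stand-in for legacy code that reads the "global" directly. */
static int legacy_get_attachment(void)
{
    return current_attachment;
}

static void *worker(void *arg)
{
    /* Each thread sets its own copy; other threads are unaffected. */
    current_attachment = (int)(intptr_t)arg;
    return (void *)(intptr_t)legacy_get_attachment();
}

/* Starts up to 16 threads with ids 1..nthreads, stores what each one
 * saw in seen[], and returns the calling thread's own (untouched) copy. */
int thread_local_demo(int *seen, int nthreads)
{
    pthread_t tid[16];
    int started = 0;

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker,
                           (void *)(intptr_t)(i + 1)) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        void *ret;
        pthread_join(tid[i], &ret);
        seen[i] = (int)(intptr_t)ret;
    }
    return current_attachment;
}
```

Structures that really must be shared between threads can then be moved
out of thread-local storage one by one, each with its own lock.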

And our version will be stable all the time. It is a much, much
easier way than trying to fix SS.

Best regards,
Nickolay mailto:skidder@...