Subject: Re: [IB-Architect] Re: License Question
Author: Emil Briggs
Jim Starkey wrote:
> >OK. Now things are starting to make sense. I would argue though
> >that a process that holds a lock that it doesn't need is a bug
> >that needs fixing. Unless there is some other reason for doing it
> >that way?
> >
> Indeed there is, sir! Database systems regularly revisit certain
> pages. Interbase, for example, uses "pointer pages" to keep track
> of data pages assigned to a particular table; a reference to a
> particular record goes through the pointer page to the data page on which
> the record (or at least the head of the record) resides. To actually
> construct the record a fragmented tail chain may be traversed, old
> version pointers followed, blob references followed (you get the
> picture). If the database system had to read every page from the
> disk on every reference, it would be slow like pig (like Oracle?).
> So Interbase, like every other database in the world (save, maybe
> MySQL) keeps a page cache. Interbase, quite reasonably, keeps
> a page in cache, locked of course, until either somebody else wants
> it (a blocking AST) or Interbase has a better use for the cache
> buffer. It is much, much faster to keep stuff in cache and
> release it on demand than to read it each time.
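The retain-until-demanded protocol described above can be sketched in a few lines. This is a minimal illustration only — the class and method names (`PageCache`, `fetch`, `blocking_ast`) are hypothetical, not Interbase's actual code, and real page locks involve lock modes, conversions, and write-back of dirty pages:

```python
# Hypothetical sketch of "keep the page locked until someone asks for it".
# Names are illustrative; this is not Interbase's real cache code.

class Page:
    def __init__(self, page_no):
        self.page_no = page_no
        self.locked = False


class PageCache:
    def __init__(self):
        self.pages = {}  # page_no -> Page, each retained with its lock held

    def fetch(self, page_no):
        """Return a page, reading from 'disk' only on a cache miss."""
        page = self.pages.get(page_no)
        if page is None:
            page = Page(page_no)   # stand-in for a real disk read
            page.locked = True     # lock is acquired and then *kept*
            self.pages[page_no] = page
        return page                # cache hit: no disk I/O, no lock traffic

    def blocking_ast(self, page_no):
        """Another process wants the page: only now is the lock released."""
        page = self.pages.pop(page_no, None)
        if page is not None:
            page.locked = False    # in reality: write back if dirty, then release
```

The point of the design is visible in `fetch`: as long as no blocking AST arrives, repeated references to the same page cost neither a disk read nor a round trip to the lock manager.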

Thanks for the explanation but I'm afraid you're misunderstanding me.
I'm not suggesting that the pages should be read from disk
on every reference. However, the generic lock manager setup that
you described for Unix sounds clumsy and overly complicated.
Assuming a page is in cache, why should it be locked if no one is
using it? Maybe I'm misunderstanding your description, but it sounds
like you were saying that if one process acquires the lock, it holds
it until someone else signals that it wants it. Even if the
first process doesn't need the lock anymore, it still keeps it.

> >
> >It's more a question of price/performance. If I can use 6 dual CPU
> >machines in a cluster it's a lot more cost effective than a
> >single 8 CPU machine. That's not always possible even in HPC
> >applications but it's nice if you can get it.
> >
> Do remember that the goal of a database designer is to bottleneck
> at available disk bandwidth. Disks are the only part of computing
> that hasn't gotten significantly faster (ok, transfer rates are
> up, but rotational delay and seek times are at best 2X what they
> were when you were born). As long as there are enough CPU

I don't think so. (I was born in 1961 and disks have gotten much
more than 2X faster since then!).
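The part of the 2X claim that concerns rotational delay is easy to check with the standard formula: average rotational latency is half a revolution. A quick sketch (the RPM figures are illustrative examples of older versus circa-2000 drives, not data from the thread):

```python
# Average rotational delay of a disk is half a revolution.
def avg_rotational_delay_ms(rpm):
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

old = avg_rotational_delay_ms(3600)  # illustrative early drive: 8.33 ms
new = avg_rotational_delay_ms(7200)  # illustrative modern drive: 4.17 ms
print(old / new)  # -> 2.0
```

Doubling the spindle speed halves the rotational delay, so going from 3600 to 7200 RPM is exactly the 2X improvement in dispute — transfer rates, by contrast, have improved by far larger factors.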

> cycles to saturate the disk more doesn't make it faster (given
> classical database architecture). Modern uniprocessors are
> unbelievably fast; multi-processors boggle the mind. A cluster
> architecture is just not needed to supply the cycles to clog
> the disk channel. A cluster architecture does impose a significant
> tax on every lock, read, and write, so the net performance gain
> on a cluster for a well written* database system is likely to
> be negative. Sorry.

No. The disk channel is not the bottleneck in all applications.
In some web applications the database is read-mostly and large
parts of it can be cached in RAM. In this case the bottleneck
is more likely to be CPU or memory bandwidth. Memory bandwidth is
a serious issue with multiprocessor systems and is one reason
why clusters are so useful in HPC. (The memory bandwidth scales
linearly with the number of nodes -- to do this with an SMP machine
is very expensive and is in fact impossible beyond about 64 CPUs
with current technology). An additional advantage of a cluster
architecture is redundancy in case of hardware failures,
which is a very important consideration. Companies are willing
to pay big $$$$ for 99.9% and better availability, which is a
big incentive for developing cluster technologies.
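To put a number on what "99.9% and better" buys: availability translates directly into permitted downtime per year, which is the arithmetic behind those big $$$$:

```python
# Downtime budget implied by an availability figure.
def downtime_hours_per_year(availability):
    hours_per_year = 365.25 * 24  # ~8766 hours, including leap years
    return (1.0 - availability) * hours_per_year

print(downtime_hours_per_year(0.999))   # "three nines": ~8.8 hours/year
print(downtime_hours_per_year(0.9999))  # "four nines": under an hour/year
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why redundant cluster hardware is worth paying for even when a single machine has the raw cycles.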