Subject: Re: Some benchmarks about 'Order by' - temporary indexes?...
Author: m_theologos
--- In Firebird-Architect@yahoogroups.com, Jim Starkey <jas@...> wrote:

> If the only overhead were the seek and rotational latency of the hard
> disk, I wouldn't quibble at all. But a Firebird record fetch requires
> the following:
>
>     * A fetch of a pointer page, which might, though probably not,
>       require a page read. It does require an interlocked trip through
>       the page cache manager, tweaking data structures, maybe
>       incurring aged page writes, etc.
>     * A handoff from the pointer page to the data page. This is a
>       fetch of the data page (see above) and a release of the pointer
>       page (another trip through the page cache manager).
>     * Fetch of the record header
>     * A decision on whether to fetch another version, which would
>       require another pair of release/fetch trips through the page
>       cache manager.
>     * Decompression of the record
>     * Reassembly of the record if it's fragmented (more trips through
>       the page cache manager)
>     * Release of the data page (yet another trip through the page
>       cache manager)
>
> Each disk read incurs at least the following:
>
>     * An OS call to initiate the read. The OS passes the operation to
>       the driver to initiate or queue it
>     * A thread stall and switch
>     * If no runnable thread, a process context switch
>     * When the read completes, maybe a process context switch back
>     * A thread switch back
>
> In the sort alternative, it's a memory reference.
>
> When I originally wrote Interbase, I sweated every cycle in the page
> cache manager. I sweated it all over again on Vulcan. As the system
> has grown in complexity, so has the complexity of the page cache
> manager, with a resulting drop in performance. The page cache manager
> is the core and the throttle of Firebird/Interbase.
>
> Falcon was designed to circumvent the problem. Rather than using
> memory to hold more pages (an inefficient cache), it uses memory to
> hold whole records. This makes a record reference
>
>     * A single shared read/write lock executed in user mode
>     * An indexed walk down a fixed width, variable depth tree
>     * Increment of an interlocked record use count
>     * Release of the read/write lock in user mode
>
> The basic record reference cycle is probably a hundredth (or less) of
> the cost of the Firebird equivalent without a disk read or write. A
> cache miss, however, isn't much different from Firebird.
>
> Memory used to be expensive and limited in size and address space.
> Now it's cheap, really fast, and huge.
>
> About a million years ago I attended a talk on memory hierarchies.
> The guy argued that there was an inherent pyramid of memory
> references with different tradeoffs of size and speed: register,
> cache, main memory, disk. What has happened in those million years
> is that the shape of the pyramid has changed. It's still a pyramid,
> but it's shorter and wider. The speed difference between cache and
> main memory is narrowing while the relative sizes are changing
> dramatically. Main memory is radically faster. Disks are almost
> unchanged (OK, solid state disks change this -- somewhat).
> Intelligent design dictates that when the environment changes, so
> must the design. In other words, the intelligent use of more really
> fast memory is not just "more page buffers."
>
> Getting back to your argument, the real impact of solid state disks
> is on the serial logs, not database pages. The best (known) way of
> pumping transactions is to write all updates to a single serialized
> log periodically flushed serially with non-buffered writes, letting
> a background thread translate the updates to the on-disk structure.
> With batch commits, there are fewer disk writes than transactions.
> And if each disk write is a non-buffered write to a sequential file
> on a solid state disk, well...
>
> We're trying to get a Falcon alpha released open source in the next
> month or two. Some of it will look familiar, particularly to
> developers working on Vulcan. The rest of it, I hope, will be a look
> at alternatives made possible by quantitative changes in computer
> architecture.
>
> --
>
> Jim Starkey
> Netfrastructure, Inc.
> 978 526-1376
>

I think it's much more efficient to have INFORMATION in the cache
(i.e. entire records) rather than raw DATA which must still be
converted; as Jim said, that conversion has many side effects on speed
and system complexity.

Just my 2c,

m. Th.