Subject Re: Some benchmarks about 'Order by' - temporary
Author m_theologos
--- In Firebird-Architect@yahoogroups.com, Pavel Cisar <pcisar@...>
wrote:
>
> Hi,
>
> Vlad Horsun wrote:
> >
> > What i mean is that sorted runs contains whole records and
during
> > merge phase we must read it from disk and write back merged
larger
> > runs. Formally we need only keys to compare and merge. Therefore
> > we have different sort time for records of different size but
with keys
> > of equal size. More record_size/key_size more performance loss we
> > have.
>
> Just an idea.
>
> In the project I was involved in the past that had to sort records
of
> various size in sets of various size we had a great success with
> building B-tree index on the fly + read in index order instead
sorting
> the set. Sort performance degrades rapidly with set size (size*row
> length in fact), while time requirements to build the index was
more or
> less linear with set size. Index was a balanced B-tree organized in
> pages of fixed size. Index key values weren't stored in index
directly,
> but as pointers to the row in base set. Index was hence very dense
and
> could be held in memory even for large sets. Of course, once the
set
> pages / clusters has to be flushed on disk, there was a penalty in
index
> building when set page has to be read to retrieve the key value,
but
> anyway, it was quite fast. Lightning fast when everything fits into
> memory (that are bigger and cheaper by every year), and I guess
that
> this strategy could be enhanced and optimized for better
performance
> when base set doesn't fit into memory (we didn't spend a lot of
time to
> try optimize it, as results we had were good enough for our needs).
>
> best regards
> Pavel Cisar
> IBPhoenix
>

FoxPro's Rushmore perhaps?... :) I'm kidding but not completely. This
was my first ideea adapted from FoxPro's engine. But as Pavel said,
it can be optimized, and, perhaps, cached for a next retreival until
the underlying table(s) change (when we must dispose it).

HTH,

m. th.