Subject: Re: Some benchmarks about 'Order by' - temporary
Author m_theologos
--- In, Pavel Cisar <pcisar@...> wrote:
> Hi,
> Vlad Horsun wrote:
> >
> > What I mean is that sorted runs contain whole records, and in
> > the merge phase we must read them from disk and write back merged
> > runs. Formally we need only the keys to compare and merge. Therefore
> > we get different sort times for records of different sizes but with
> > keys of equal size: the larger the record_size/key_size ratio, the
> > more performance we lose.
> Just an idea.
> In a project I was involved in in the past, which had to sort records
> of various sizes in sets of various sizes, we had great success with
> building a B-tree index on the fly + reading in index order, instead
> of sorting the set. Sort performance degrades rapidly with set size
> (size * row length, in fact), while the time required to build the
> index was more or less linear with set size. The index was a balanced
> B-tree organized in pages of fixed size. Key values weren't stored in
> the index itself, only pointers to the rows in the base set. The index
> was hence very dense and could be held in memory even for large sets.
> Of course, once the pages / clusters had to be flushed to disk, there
> was a penalty in index building when a set page had to be read back to
> retrieve the key value, but anyway, it was quite fast. Lightning fast
> when everything fits into memory (which gets bigger and cheaper every
> year), and I guess this strategy could be enhanced and optimized for
> better performance when the base set doesn't fit into memory (we
> didn't spend a lot of time trying to optimize it, as the results we
> had were good enough for our needs).
> best regards
> Pavel Cisar
> IBPhoenix
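A minimal Python sketch of the key + pointer idea Pavel describes: sort only (key, row-pointer) pairs and then read the base set in index order, rather than shuffling whole records through sorted runs. The names and data here are hypothetical, and a real implementation would use fixed-size B-tree pages rather than a Python list; `bisect.insort` just stands in for "building the index on the fly".

```python
import bisect

# Hypothetical base set: each row is (key, payload), where payload
# stands in for the large non-key portion of a record.
rows = [(42, "x" * 10), (7, "y" * 10), (99, "z" * 10), (7, "w" * 10)]

# Build the "index" incrementally: store only the key plus a pointer
# (here, the row number) -- never the whole record.
index = []
for rowid, (key, _payload) in enumerate(rows):
    bisect.insort(index, (key, rowid))  # keeps the index sorted on insert

# Read the base set in index order instead of sorting whole records.
ordered = [rows[rowid] for _key, rowid in index]
print([key for key, _ in ordered])  # -> [7, 7, 42, 99]
```

Note that the index entries stay small regardless of record size, which is exactly the record_size/key_size effect Vlad mentions: the random reads back into the base set are the price paid for not moving the payloads at all.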

FoxPro's Rushmore, perhaps?... :) I'm kidding, but not completely. This
was my first idea, adapted from FoxPro's engine. But as Pavel said, it
can be optimized and, perhaps, cached for the next retrieval until the
underlying table(s) change (at which point we must dispose of it).


m. th.