Subject Re: [Firebird-Architect] Re: Some benchmarks about 'Order by' - temporary
Author Vlad Horsun
> Vlad Horsun wrote:
> >
> > This is known weak of our sorting algoritm. In-memory we do move
> > of pointers but on disk we move whole records. You can learn sort.cpp
> > to be sure
>
> Are you sure that's a weakness? True it makes sorting slower, but
> it avoids re-reading the records in sorted order which is much slower.
> We should only be carrying the fields that were requested, so the
> size of the record doesn't matter, only the number and size of the
> fields in the select list.

Yes, having all the select list's fields in sorted order is very good
thing as it avoid needs to reread it from tables again. This is strong
point no doubt.

What i mean is that sorted runs contains whole records and during
merge phase we must read it from disk and write back merged larger
runs. Formally we need only keys to compare and merge. Therefore
we have different sort time for records of different size but with keys
of equal size. More record_size/key_size more performance loss we
have. Often peoples need to rewrite queries to move some large
grouped fields to subselects, for example

select d.docid, d.comment, sum(l.amount)
from docs d join lines l on d.docid = l.lineid
group by d.docid, d.comment

rewritten as

select d.docid,
(select d2.comment from docs d2 where d.docid = d2.docid) as comment,
sum(l.amount)
from docs d join lines l on d.docid = l.lineid
group by d.docid

assume that docs.docid is primary key and docs.comment is large varchar field

Regards,
Vlad