Subject: Re: [Firebird-Architect] Re: Some benchmarks about 'Order by' - temporary
Author: Vlad Horsun
> Err, actually, Jim and I just talked about the way sort works,
> and unless something has changed, nothing is "moved around on
> disk". When a sort run is larger than the memory allowed, it
> is written out in sorted order. When all records have been
> sorted, the runs are merged. If that's right, then runs are
> both read and written sequentially, which seems like the best
> of both worlds... given that you've got a world where the
> sorted set doesn't fit in memory.
>
> Where have I gone wrong?

You're not wrong.

The weakness is that the sorted runs contain not only the key fields but
all other fields from the select list. For example:

SELECT * FROM RDB$RELATIONS
ORDER BY RDB$RELATION_ID

Here the sort record size is 472 bytes, but only 2 keys (one for
RDB$RELATION_ID and one for its NULL flag), 8 bytes in total, are
actually needed for sorting.

Our sorted runs will contain sort records of 119 longs, but to sort
(and merge) them we need only 2 longs. Hence, while merging runs, we move
much more data to/from disk than is actually needed.
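
To illustrate with a minimal C++ sketch (these are not Firebird's actual
structures; the sizes come from the example above):

#include <cstdint>
#include <cstring>

// Two key longs (RDB$RELATION_ID and its NULL flag) followed by the
// rest of the row from the select list.
struct SortRecord
{
    uint32_t key[2];        // the 8 bytes that are actually compared
    uint8_t  payload[464];  // remaining fields - dead weight while merging
};                          // ~472 bytes written and re-read per record

// The merge comparator touches only the first 8 bytes...
static int compareRecords(const SortRecord& a, const SortRecord& b)
{
    return std::memcmp(a.key, b.key, sizeof(a.key));
}
// ...yet every merge pass streams full records, so merge IO is roughly
// 472 / 8 = 59 times larger than the keys alone would require.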

If we split the sort record into two parts, {key, offset} and
{not_key_fields}, and sort only the first part, we will do much less work
at the merge stage, but the final stage will cost more, as we will load
the {not_key_fields} in random order from a secondary temp file.
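
A rough sketch of that split, with illustrative names only (KeyRef,
emitSorted and the file layout are assumptions, not engine code):

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

struct KeyRef
{
    uint32_t key[2];   // the 8 bytes that order the rows
    uint64_t offset;   // position of {not_key_fields} in a secondary temp file
};                     // only 16 bytes per record pass through sort and merge

static bool keyLess(const KeyRef& a, const KeyRef& b)
{
    return std::memcmp(a.key, b.key, sizeof(a.key)) < 0;
}

// Final stage: payloads are fetched in key order, i.e. at random
// offsets in the payload file - this is where the extra cost appears.
static void emitSorted(std::FILE* payloads, std::vector<KeyRef>& refs,
                       char* buf, size_t recLen)
{
    std::sort(refs.begin(), refs.end(), keyLess);
    for (const KeyRef& r : refs)
    {
        std::fseek(payloads, static_cast<long>(r.offset), SEEK_SET); // random seek
        if (std::fread(buf, 1, recLen, payloads) == recLen)
        {
            // ... hand buf (the reassembled record) to the caller ...
        }
    }
}

The trade-off: merge moves 16 bytes per record instead of ~472, but the
final pass pays one random seek per record.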

I must say that the final stage of our current sort algorithm also
involves much random IO: we store sorted runs of size up to

MAX_SORT_BUFFER_SIZE * RUN_GROUP ^ MAX_MERGE_LEVEL

which currently is 128KB * 8 ^ 2 = 8MB.

Therefore even medium-sized sorts (say 500MB-1GB) will consist of 63-128
sorted runs. A call to SORT_get performs the merge on the fly. This leads
to random reads from 63-128 places in the temp file, which is very slow,
especially on the first call to SORT_get, as we must read the first
records from each run before we can return any data to the user.
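
To make that access pattern concrete, a toy k-way merge over the run
files (SORT_get's real interface and record format differ; keys here are
bare integers):

#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

struct RunHead
{
    long long key;  // current record of one sorted run
    int       run;  // index of the run it came from
    bool operator>(const RunHead& o) const { return key > o.key; }
};

// With 8MB runs, a 500MB-1GB sort leaves 63-128 of them. Returning the
// very first record already costs one read per run, scattered across
// the temp file.
static bool getNextSorted(std::vector<std::FILE*>& runs,
                          std::priority_queue<RunHead, std::vector<RunHead>,
                                              std::greater<RunHead> >& heads,
                          bool& primed, long long& out)
{
    if (!primed)  // the expensive first call: touch every run once
    {
        for (int i = 0; i < static_cast<int>(runs.size()); i++)
        {
            long long k;
            if (std::fread(&k, sizeof(k), 1, runs[i]) == 1)
                heads.push(RunHead{k, i});
        }
        primed = true;
    }
    if (heads.empty())
        return false;
    RunHead h = heads.top();
    heads.pop();
    out = h.key;
    long long next;
    if (std::fread(&next, sizeof(next), 1, runs[h.run]) == 1)
        heads.push(RunHead{next, h.run}); // refill from the same run only
    return true;
}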

Regards,
Vlad