Subject | Re: [firebird-support] Re: TempDirectories and ALTER INDEX ACTIVE |
---|---|
Author | Mark Rotteveel |
Post date | 2012-06-10T11:06:24Z |
On 10-6-2012 12:32, Dmitry Yemanov wrote:
(and the additional time required to perform the sort), then I do wonder
if the advantage is really that big.
The build of a balanced b-tree(*) is O(n log n), and the 'good' case of
sorting is O(n log n) to O(n^2) (ideal is O(n)). Sorting before
insertion might only be worth the effort if the sort can fully occur in
memory. The additional overhead of writing to disk during the sort and
reading from disk for insertion might negate most of the advantages
against the overhead of inserting into the b-tree unsorted (ie balancing
etc).
It might be worth to investigate this tradeoff, or just to provide an
option to rebuild indices without using temporary sort spaces so others
can measure it for themselves (or to have a workaround if disk space is
not large enough to accommodate the rebuild).
*) I am assuming FB uses a balanced trees, because sorting first would
otherwise degenerate it into a linked list
--
Mark Rotteveel
> 10.06.2012 14:09, Mark Rotteveel wrote:If the trade off is requiring a large part of memory or disk space extra
>>
>> * for each record (sequential read of all pages?)
>> * insert column value into index
>
> Inserting unsorted / random values into the b-tree is known to be much
> slower than sorting the values in advance and loading them into the
> b-tree "in order".
(and the additional time required to perform the sort), then I do wonder
if the advantage is really that big.
The build of a balanced b-tree(*) is O(n log n), and the 'good' case of
sorting is O(n log n) to O(n^2) (ideal is O(n)). Sorting before
insertion might only be worth the effort if the sort can fully occur in
memory. The additional overhead of writing to disk during the sort and
reading from disk for insertion might negate most of the advantages
against the overhead of inserting into the b-tree unsorted (ie balancing
etc).
It might be worth to investigate this tradeoff, or just to provide an
option to rebuild indices without using temporary sort spaces so others
can measure it for themselves (or to have a workaround if disk space is
not large enough to accommodate the rebuild).
*) I am assuming FB uses a balanced trees, because sorting first would
otherwise degenerate it into a linked list
--
Mark Rotteveel