Subject | |
---|---|
Author | Ann W. Harrison |
Post date | 2001-01-03T00:22:05Z |
At 02:43 PM 1/2/2001 -0700, Jason Wharton wrote:
From someone else:
been sorted, then there's a win here. If we're talking
about the highest or lowest valued 10k records out of
100M or more, then maybe an index walk is faster than
a sort.
I am anything but an academic computer scientist and
dragging out Knuth on Sorting and Searching has given
me a headache. My feeble recollection is that a series
of approximate sorts is n log n, while a straight insert
sort is closer to n * n.
There's probably something essential I've missed, but
identifying the highest (or lowest) 10K out of 100M
pretty much involves sorting the whole 100M - depending
on the order of input. You read in the first 10K, then
look at the next row - if it's lower than your high
value, throw out the low value and stick this one in.
Then find the new low value and repeat.
there is (was) code in the optimizer that recognized the
blr_first qualifier and used index navigation.
...
indexes are concerned.
these two lines from opt.c, routine gen_nav.
|| (ptr [sort->nod_count] && !(idx->idx_flags & idx_descending)) ||
(!ptr [sort->nod_count] && (idx->idx_flags & idx_descending)))
The sort tail is composed of pairs of value node pointers and true/false
flags indicated whether the column is ascending or descending. I think
the referencing here is wrong, but more likely (65%) I'm wrong and the
problem is slightly different.
Regards,
Ann
www.ibphoenix.com
We have answers.
Regards,
Ann
www.ibphoenix.com
We have answers.
From someone else:
> > >TOP {n} would be handy ... even though it would go thru allFrom me:
> > >records in the result set for ordering purposes, only those
> > >within the 10000 window would be kept in the sort file (other
> > >records would be discarded during the fetching).
> > That wouldn't work as well as you might think, givenFrom Jason Wharton:
> > the sort algorithm used. Besides, the major cost is
> > retrieval, not sorting.
>The person is only after a fixed number of records andIf the person wants an arbitrary 10K records that have
>has no interest in any beyond that amount. Certainly
>this could make one algorithm more favorable to justify
>this optimization. Scan the records in natural order
>and use less resources to contain the temporary result
>cache is the suggestion. Sounds like a great idea to me.
been sorted, then there's a win here. If we're talking
about the highest or lowest valued 10k records out of
100M or more, then maybe an index walk is faster than
a sort.
I am anything but an academic computer scientist and
dragging out Knuth on Sorting and Searching has given
me a headache. My feeble recollection is that a series
of approximate sorts is n log n, while a straight insert
sort is closer to n * n.
There's probably something essential I've missed, but
identifying the highest (or lowest) 10K out of 100M
pretty much involves sorting the whole 100M - depending
on the order of input. You read in the first 10K, then
look at the next row - if it's lower than your high
value, throw out the low value and stick this one in.
Then find the new low value and repeat.
> > >If ordering can be done *without* a temporary result set,Yes, I've understood that. I believe Jim has said that
> > >e.g. via index navigation, you would get results very
> > >quickly (minimal processing before returning a record).
> > >Can the database engine currently handle this?
> >
> > Yes, and for large result sets it is often SLOWER to
> > finish than a sort. The first row comes faster, but
> > the total time is LONGER. Really and truly. Tested
> > and verified.
>
>Right, we aren't questioning that. We want the first row(s)
>as fast as possible.
there is (was) code in the optimizer that recognized the
blr_first qualifier and used index navigation.
...
>Leave things as they are now, just giveFine. I think we're in violent agreement where navigational
>us the option to tell the server to act the way we want in our circumstances
>that warrant it.
indexes are concerned.
>PS. I sure would like to hear back from you concerning the compoundThere's an 85% probability that the bug is in (or indicated by)
>descending order by optimization issues...
these two lines from opt.c, routine gen_nav.
|| (ptr [sort->nod_count] && !(idx->idx_flags & idx_descending)) ||
(!ptr [sort->nod_count] && (idx->idx_flags & idx_descending)))
The sort tail is composed of pairs of value node pointers and true/false
flags indicated whether the column is ascending or descending. I think
the referencing here is wrong, but more likely (65%) I'm wrong and the
problem is slightly different.
Regards,
Ann
www.ibphoenix.com
We have answers.
Regards,
Ann
www.ibphoenix.com
We have answers.