Subject: Re: question about optimizing group by
Author woodsmailbox
Thanks Ann, just a few more comments.

You said:
> Carrying the company name around in the sort costs almost nothing -
> this (...) The sort comparison will check a few extra bytes... which
> costs nothing compared with reading every row in each table.

A real-life query would involve many more columns than just
company_name in the group-by list, and it would generally include a
where clause as well. Does that change the story for the optimizer?
I.e., would the large number of columns in the group-by affect speed
then? Could it hit any limits for sort tables, etc.?
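To make the question concrete, the kind of real-life query I have in mind looks roughly like this (the schema, table, and column names are made up purely for illustration):

```sql
-- Hypothetical schema: orders joined to companies.
-- Every non-aggregated select column must appear in the GROUP BY,
-- so columns functionally dependent on company_id still get
-- carried through the sort.
SELECT c.company_id,
       c.company_name,
       c.city,
       c.country,
       SUM(o.amount) AS total
FROM orders o
JOIN companies c ON c.company_id = o.company_id
WHERE o.order_date >= '2009-01-01'
GROUP BY c.company_id, c.company_name, c.city, c.country
```

Here the sort key is no longer "a few extra bytes" but several potentially wide columns, which is the case I'm asking about.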

>
> One thing to remember is that sorting data in memory is much faster
> than reading it off disk, despite the nlog(n) factor.

Assume the database and the tmp directory sit on a RAM drive, so I'm
only concerned with the actual sorting/optimization cost.

> Aside from the time it will take to implement the hypothetical first
> function, this one isn't much better than the original.

Why?
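For reference, the alternative I had in mind with the hypothetical first() aggregate (which, as you note, doesn't exist and would have to be implemented) would be something like this, again with made-up table names:

```sql
-- Hypothetical FIRST() aggregate: group only by the key and pull
-- the functionally dependent columns out via the aggregate, so the
-- sort key shrinks to company_id alone.
SELECT c.company_id,
       FIRST(c.company_name) AS company_name,  -- hypothetical function
       SUM(o.amount) AS total
FROM orders o
JOIN companies c ON c.company_id = o.company_id
GROUP BY c.company_id
```

My (possibly wrong) intuition was that sorting on a single narrow key should beat sorting on the full column list, which is why I'm surprised it "isn't much better than the original."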

And here's why I'm asking: if the speed really isn't affected by the
number and size of the columns in the group-by clause (or by whether
compound indexes exist on those columns), then I wouldn't be looking
for a solution at all. But that's exactly what I'm not yet convinced
of, since I got quite different results with the different
alternatives I tried.