Subject | RE: [firebird-support] Re: question about optimizing group by |
---|---|
Author | Svein Erling Tysvær |
Post date | 2008-11-21T14:14:01Z |
Does it matter (performance wise) whether an index is used for company_name)? It is easy for you to check, just see if your original query performs similar to
select
c.company_id,
c.company_name || '',
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by 1, 2
since this cannot use any index for c.company_name (you probably have to run both statements several times just to ascertain that cache doesn't come into play). Normally, I would either use your original query or
select
c.company_id,
c.company_name,
(select count(1) from company_employees ce where ce.company_id = c.company_id) as employee_count
from
companies c
to get the number you want. I doubt there is much difference between these options, I would never choose your alternative 1, don't (yet) see any benefit of theoretical alternative 3 (I expect it to be very similar to alternative 1 if the field is indexed), but could possibly use alternative 2 (using the WITH equivalent rather than the subselect in the FROM clause - simply because I think this notation is easier to read and nest) to simplify more complex queries.
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of woodsmailbox
Sent: 21. november 2008 14:18
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: question about optimizing group by
Ok, sorry, I stripped down the query to get understood to the point
that it didn't make sense. Please let me try again.
Below are 4 equivalent selects. Question is, wouldn't the last query
perform faster than the others?
select
c.company_id,
c.company_name,
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id,
c.company_name -- SUPERFLOUS INCLUSION OF company_name in SORT
TABLE or INDEX SCAN
ALTERNATIVE SYNTAX #1
select
c.company_id,
min(c.company_name), -- SUPERFLOUS INDEX SCAN TO COMPUTE min()
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id
ALTERNATIVE SYNTAX #2
select
c.company_id,
c.company_name,
t.employee_count
from
companies c
inner join (select company_id, count(1) as employee_count from company_employees group by company_id)
group by
c.company_id
ALTERNATIVE SYNTAX #3 with a hypothetical first() aggregate function:
select
c.company_id,
first(c.company_name),
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id
select
c.company_id,
c.company_name || '',
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by 1, 2
since this cannot use any index for c.company_name (you probably have to run both statements several times just to ascertain that cache doesn't come into play). Normally, I would either use your original query or
select
c.company_id,
c.company_name,
(select count(1) from company_employees ce where ce.company_id = c.company_id) as employee_count
from
companies c
to get the number you want. I doubt there is much difference between these options, I would never choose your alternative 1, don't (yet) see any benefit of theoretical alternative 3 (I expect it to be very similar to alternative 1 if the field is indexed), but could possibly use alternative 2 (using the WITH equivalent rather than the subselect in the FROM clause - simply because I think this notation is easier to read and nest) to simplify more complex queries.
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of woodsmailbox
Sent: 21. november 2008 14:18
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: question about optimizing group by
Ok, sorry, I stripped down the query to get understood to the point
that it didn't make sense. Please let me try again.
Below are 4 equivalent selects. Question is, wouldn't the last query
perform faster than the others?
select
c.company_id,
c.company_name,
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id,
c.company_name -- SUPERFLOUS INCLUSION OF company_name in SORT
TABLE or INDEX SCAN
ALTERNATIVE SYNTAX #1
select
c.company_id,
min(c.company_name), -- SUPERFLOUS INDEX SCAN TO COMPUTE min()
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id
ALTERNATIVE SYNTAX #2
select
c.company_id,
c.company_name,
t.employee_count
from
companies c
inner join (select company_id, count(1) as employee_count from company_employees group by company_id)
group by
c.company_id
ALTERNATIVE SYNTAX #3 with a hypothetical first() aggregate function:
select
c.company_id,
first(c.company_name),
count(1) as employee_count
from
companies c
inner join company_employees ce on ce.company_id = c.company_id
group by
c.company_id