Subject | join now or later |
---|---|
Author | Adam |
Post date | 2005-05-26T04:11:05Z |
Hello Group,
I was wondering about the following overly simplified example.
If I had three tables
Site (ID,Name)
Employee (ID,Name)
SiteEmployee (SiteID,EmployeeID)
It may be necessary to run something like
select SiteID, EmployeeID
from SiteEmployee
where SiteID in (1,2,3)
Nothing to special here, but if I needed to also have the Site Name
and Employee Name for a report or something, there are two ways I
could approach this.
The easiest would be something like:
select se.SiteID, s.Name as SiteName, se.EmployeeID, e.Name as
EmployeeName
from SiteEmployee se
join Site s on (se.SiteID = s.ID)
join Employee e on (se.EmployeeID = e.ID)
where se.SiteID in (1,2,3)
The other way would be to bring back the first query, parse it into a
CSV full of SiteID and a CSV full of EmployeeID, then run separate
queries along the lines
Select ID, Name
from Site
where ID in ...
Select ID, Name
from Employee
where ID in ...
and marry them up in a client dataset or alike. Here is my thinking,
and please let me know if I am too sleep deprived. For datasets with
few duplicates, the first method will be quicker. For datasets with
more duplicates (eg: lots of employees at only one single site), the
cost of continuously returning the same "SiteName" will be more than
retrieving the smaller dataset and marrying them later.
I am further guessing that over a local TCP connection, there will be
little difference, but over a slow LAN is may be more noticable. I am
also guessing that using compression would make the difference
negligable. But I am doing too much guessing here. What approach would
you take?
I was wondering about the following overly simplified example.
If I had three tables
Site (ID,Name)
Employee (ID,Name)
SiteEmployee (SiteID,EmployeeID)
It may be necessary to run something like
select SiteID, EmployeeID
from SiteEmployee
where SiteID in (1,2,3)
Nothing to special here, but if I needed to also have the Site Name
and Employee Name for a report or something, there are two ways I
could approach this.
The easiest would be something like:
select se.SiteID, s.Name as SiteName, se.EmployeeID, e.Name as
EmployeeName
from SiteEmployee se
join Site s on (se.SiteID = s.ID)
join Employee e on (se.EmployeeID = e.ID)
where se.SiteID in (1,2,3)
The other way would be to bring back the first query, parse it into a
CSV full of SiteID and a CSV full of EmployeeID, then run separate
queries along the lines
Select ID, Name
from Site
where ID in ...
Select ID, Name
from Employee
where ID in ...
and marry them up in a client dataset or alike. Here is my thinking,
and please let me know if I am too sleep deprived. For datasets with
few duplicates, the first method will be quicker. For datasets with
more duplicates (eg: lots of employees at only one single site), the
cost of continuously returning the same "SiteName" will be more than
retrieving the smaller dataset and marrying them later.
I am further guessing that over a local TCP connection, there will be
little difference, but over a slow LAN is may be more noticable. I am
also guessing that using compression would make the difference
negligable. But I am doing too much guessing here. What approach would
you take?