Subject | RE: [ib-support] COUNT DISTINCT on a composite key |
---|---|
Author | Robert DiFalco |
Post date | 2003-05-02T06:37:12Z |
Actually, I simplified that actual problem a little too much.
I have a Links table that keeps Edges of a tree -- parents and children
where the children themselves could be group or leaf nodes.
I then have a Paths table that keeps all ancestor and descendent paths
along with a depth of the path. To keep this table from getting too big,
it does not have the actual edges, those are in the Links table.
So consider a tree something like this:
Root
/ \
Group1 Group2
/ \
Leaf1 Leaf1
I have a table called "Groupable" for the objects I am grouping. I'll
keep them simple for now but they have many other fields.
ID Name Description
=== ======== ===========
0 Root blah
1 Group1 ...
2 Group2 ...
3 Leaf1 ...
The Links table would look something like this:
ParentID ChildID
======== =======
0 1
0 2
1 3
2 3
The Path Table would have something like this:
AncestorID DescendantID Depth
========== ============ =====
0 1 1
0 2 1
This isn't a great example because there are only to levels of groups.
But each path (besides the edges) it put into this table. If I had a
group above "Root" with an ID of -1, the table would look like this:
AncestorID DescendantID Depth
========== ============ =====
-1 0 1
-1 1 2
-1 2 2
0 1 1
0 2 1
But for now let's assume I don't. So, to get a count (or groupable
records) of all the direct "Parents" of Leaf1 (i.e. Group1 and Group2) I
would do this:
SELECT [columns | COUNT(*)]
FROM Groupable JOIN Link ON Groupable.ID = Link.ParentID
WHERE ChildID = 3;
If I want to get a count (or matching Groupable records) for all
"Ancestors" of Leaf1 (i.e. Group1, Group2, and Root) I would do this:
SELECT [columns | COUNT(*)]
FROM Groupable
JOIN Paths ON Groupable.ID = Paths.AncestorID
FROM Paths
JOIN Links ON Paths.DescendentID = Links.ParentID
WHERE Link.Child = 3;
Well, damn, I get 4 instead of 3 because "Leaf1" will appear twice (as
it has two paths to it).
I can fix it by doing this:
SELECT [DISTINCT columns | COUNT(DISTINCT Groupable.ID)]
FROM Groupable
JOIN Paths ON Groupable.ID = Paths.AncestorID
FROM Paths
JOIN Links ON Paths.DescendentID = Links.ParentID
WHERE Link.Child = 3;
But this is my problem. What if I happened to have a composite Primary
Key in Groupable so that in Paths I have two columns for Ancestor and
two for Descendent, in Links two for Parent and two for Child where I
need both to create a Unique key? As far as I know, the SQL standard
does not support SELECT COUNT(DISTINCT col1, col2). I suppose I could do
a nested SELECT where the count does a "SELECT ... IN" but that seems
heavy handed. I hope I got the explanation correct. Am I making sense?
R.
-----Original Message-----
From: Helen Borrie [mailto:helebor@...]
Sent: Thursday, May 01, 2003 5:12 PM
To: ib-support@yahoogroups.com
Subject: Re: [ib-support] COUNT DISTINCT on a composite key
At 04:44 PM 1/05/2003 -0700, you wrote:
But Select count(*) is all you need. The PK is by definition unique.
Your
table will have exactly as many rows as there are distinct occurrences
of
the key elements.
heLen
I have a Links table that keeps Edges of a tree -- parents and children
where the children themselves could be group or leaf nodes.
I then have a Paths table that keeps all ancestor and descendent paths
along with a depth of the path. To keep this table from getting too big,
it does not have the actual edges, those are in the Links table.
So consider a tree something like this:
Root
/ \
Group1 Group2
/ \
Leaf1 Leaf1
I have a table called "Groupable" for the objects I am grouping. I'll
keep them simple for now but they have many other fields.
ID Name Description
=== ======== ===========
0 Root blah
1 Group1 ...
2 Group2 ...
3 Leaf1 ...
The Links table would look something like this:
ParentID ChildID
======== =======
0 1
0 2
1 3
2 3
The Path Table would have something like this:
AncestorID DescendantID Depth
========== ============ =====
0 1 1
0 2 1
This isn't a great example because there are only to levels of groups.
But each path (besides the edges) it put into this table. If I had a
group above "Root" with an ID of -1, the table would look like this:
AncestorID DescendantID Depth
========== ============ =====
-1 0 1
-1 1 2
-1 2 2
0 1 1
0 2 1
But for now let's assume I don't. So, to get a count (or groupable
records) of all the direct "Parents" of Leaf1 (i.e. Group1 and Group2) I
would do this:
SELECT [columns | COUNT(*)]
FROM Groupable JOIN Link ON Groupable.ID = Link.ParentID
WHERE ChildID = 3;
If I want to get a count (or matching Groupable records) for all
"Ancestors" of Leaf1 (i.e. Group1, Group2, and Root) I would do this:
SELECT [columns | COUNT(*)]
FROM Groupable
JOIN Paths ON Groupable.ID = Paths.AncestorID
FROM Paths
JOIN Links ON Paths.DescendentID = Links.ParentID
WHERE Link.Child = 3;
Well, damn, I get 4 instead of 3 because "Leaf1" will appear twice (as
it has two paths to it).
I can fix it by doing this:
SELECT [DISTINCT columns | COUNT(DISTINCT Groupable.ID)]
FROM Groupable
JOIN Paths ON Groupable.ID = Paths.AncestorID
FROM Paths
JOIN Links ON Paths.DescendentID = Links.ParentID
WHERE Link.Child = 3;
But this is my problem. What if I happened to have a composite Primary
Key in Groupable so that in Paths I have two columns for Ancestor and
two for Descendent, in Links two for Parent and two for Child where I
need both to create a Unique key? As far as I know, the SQL standard
does not support SELECT COUNT(DISTINCT col1, col2). I suppose I could do
a nested SELECT where the count does a "SELECT ... IN" but that seems
heavy handed. I hope I got the explanation correct. Am I making sense?
R.
-----Original Message-----
From: Helen Borrie [mailto:helebor@...]
Sent: Thursday, May 01, 2003 5:12 PM
To: ib-support@yahoogroups.com
Subject: Re: [ib-support] COUNT DISTINCT on a composite key
At 04:44 PM 1/05/2003 -0700, you wrote:
>I was wondering how to perform a distinct count when my table has awork:
>composite primary key. I naively thought something like this might
>Are you saying this doesn't work? If not, what's the exception?
>SELECT COUNT( DISTINCT keyColumn1, keyColumn2 ) ....
But Select count(*) is all you need. The PK is by definition unique.
Your
table will have exactly as many rows as there are distinct occurrences
of
the key elements.
heLen