Subject | Performance of recursive table |
---|---|
Author | chrisacron |
Post date | 2006-11-24T08:37:29Z |
Hi all
In my application I use a self-related table to define complex
classifications. The number of levels are theoretically unlimited but
in reality will probably be 10-15 maximum.
My question is whether the following query is prone for performance
issues (because of the multilevel joining of the alliases of the
classification table to the same table):
SELECT PT.ID, PT.Description
FROM PARTNERS PT
WHERE EXISTS(SELECT ID FROM Partners
LEFT OUTER JOIN Areas A1 ON A1.ID = PT.AreaId
LEFT OUTER JOIN Areas A2 ON A2.ID = A1.ParentId
LEFT OUTER JOIN Areas A3 ON A3.ID = A2.ParentId
LEFT OUTER JOIN Areas A4 ON A4.ID = A3.ParentId
LEFT OUTER JOIN Areas A5 ON A5.ID = A4.ParentId
LEFT OUTER JOIN Areas A6 ON A6.ID = A5.ParentId
LEFT OUTER JOIN Areas A7 ON A7.ID = A6.ParentId
LEFT OUTER JOIN Areas A8 ON A8.ID = A7.ParentId
WHERE QueryAreaId IN (A1.ID, A2.ID, A3.ID, A4.ID, A5.ID, A6.ID,
A7.ID, A8.ID))
QueryAreaId can be any of the nodes on the tree and I need to get all
records related to that node and any of the children. The query would
most often be more complex than that. For example requesting
aggregates from detail transactional tables where the header has a
reference to the partner which in turn links to my classification
table(s).
Also, there will be several of these self related tables that I want
to use the same sort of structure for the filtering of the detail
lines. Is it feasable to have several of these related tree sections
(multiple joins on aliases on a id = parent_id basis) in the query?
What is the implication for tables with millions of records in them
feeding into self-related tables with a few thousand records at up to
15 levels?
Another method that runs syntactically ok but I have no idea what
performace would be like:
SELECT PT.ID, PT.Description
FROM PARTNERS PT
LEFT OUTER JOIN AREAS A1 ON A1.ID = PT.AreaId
LEFT outer join AREAS A2 ON A2.ID = A1.ParentId
LEFT outer join AREAS A3 ON A3.ID = A2.ParentId
LEFT outer join AREAS A4 ON A4.ID = A3.ParentId
LEFT outer join AREAS A5 ON A5.ID = A4.ParentId
LEFT outer join AREAS A6 ON A6.ID = A5.ParentId
LEFT outer join AREAS A7 ON A7.ID = A6.ParentId
LEFT outer join AREAS A8 ON A8.ID = A7.ParentId
WHERE QueryAreaId IN (A1.ID, A2.ID, A3.ID, A4.ID, A5.ID, A6.ID,
A7.ID, A8.ID)
Is there another way of doing this?
Kind regards
Chris Laurie
In my application I use a self-related table to define complex
classifications. The number of levels are theoretically unlimited but
in reality will probably be 10-15 maximum.
My question is whether the following query is prone for performance
issues (because of the multilevel joining of the alliases of the
classification table to the same table):
SELECT PT.ID, PT.Description
FROM PARTNERS PT
WHERE EXISTS(SELECT ID FROM Partners
LEFT OUTER JOIN Areas A1 ON A1.ID = PT.AreaId
LEFT OUTER JOIN Areas A2 ON A2.ID = A1.ParentId
LEFT OUTER JOIN Areas A3 ON A3.ID = A2.ParentId
LEFT OUTER JOIN Areas A4 ON A4.ID = A3.ParentId
LEFT OUTER JOIN Areas A5 ON A5.ID = A4.ParentId
LEFT OUTER JOIN Areas A6 ON A6.ID = A5.ParentId
LEFT OUTER JOIN Areas A7 ON A7.ID = A6.ParentId
LEFT OUTER JOIN Areas A8 ON A8.ID = A7.ParentId
WHERE QueryAreaId IN (A1.ID, A2.ID, A3.ID, A4.ID, A5.ID, A6.ID,
A7.ID, A8.ID))
QueryAreaId can be any of the nodes on the tree and I need to get all
records related to that node and any of the children. The query would
most often be more complex than that. For example requesting
aggregates from detail transactional tables where the header has a
reference to the partner which in turn links to my classification
table(s).
Also, there will be several of these self related tables that I want
to use the same sort of structure for the filtering of the detail
lines. Is it feasable to have several of these related tree sections
(multiple joins on aliases on a id = parent_id basis) in the query?
What is the implication for tables with millions of records in them
feeding into self-related tables with a few thousand records at up to
15 levels?
Another method that runs syntactically ok but I have no idea what
performace would be like:
SELECT PT.ID, PT.Description
FROM PARTNERS PT
LEFT OUTER JOIN AREAS A1 ON A1.ID = PT.AreaId
LEFT outer join AREAS A2 ON A2.ID = A1.ParentId
LEFT outer join AREAS A3 ON A3.ID = A2.ParentId
LEFT outer join AREAS A4 ON A4.ID = A3.ParentId
LEFT outer join AREAS A5 ON A5.ID = A4.ParentId
LEFT outer join AREAS A6 ON A6.ID = A5.ParentId
LEFT outer join AREAS A7 ON A7.ID = A6.ParentId
LEFT outer join AREAS A8 ON A8.ID = A7.ParentId
WHERE QueryAreaId IN (A1.ID, A2.ID, A3.ID, A4.ID, A5.ID, A6.ID,
A7.ID, A8.ID)
Is there another way of doing this?
Kind regards
Chris Laurie