Subject Re: Help needed optimizing query
Author Svein Erling
--- In firebird-support@yahoogroups.com, Harriv <harriv@...> wrote:
> Hi,
>
> I did small experiment storing hierarchical data, and now I'd like > to optimize my query when retrieving tree.
>
> I'm using the Modified Preorder Tree Traversal described here:
> http://articles.sitepoint.com/article/hierarchical-data-database/2
>
> My query is:
>
> SELECT
> *
> FROM
> TEXT_DATA T
> WHERE
> T.POSITION_TAG IN (SELECT
> PSM.POSITION_TAG
> FROM
> POSITION_SCREEN_MAP PSM
> WHERE
> PSM.LFT BETWEEN (SELECT
> POSITION_SCREEN_MAP.LFT
> FROM
> POSITION_SCREEN_MAP
> WHERE
> POSITION_SCREEN_MAP.POSITION_TAG = 'AREA1') AND (SELECT
> POSITION_SCREEN_MAP.RGHT
> FROM
> POSITION_SCREEN_MAP
> WHERE
> POSITION_SCREEN_MAP.POSITION_TAG = 'AREA1')
> )
>
> If I've understood correctly, IN operator isn't very optimizable, is
> there way to avoid using it?

Hi Harriv!

IN (<constants>) is generally OK, IN (<subselect>) is not always OK and I virtually always replace such queries with EXISTS.

Looking at the interesting Modified Preorder Tree Traversal, I'd say your query can even be written simply using JOINs:

SELECT T.*
FROM POSITION_SCREEN_MAP PSM
JOIN POSITION_SCREEN_MAP PSM2
ON PSM2.LFT BETWEEN PSM.LFT AND PSM.RGHT
JOIN TEXT_DATA T
ON T.POSITION_TAG = PSM2.POSITION_TAG
WHERE PSM.POSITION_TAG = 'AREA1'

Although there are some clear advantages in the ability of this method to quickly calculate the number of child rows and probably a few other benefits as well, I'd say it is mainly useful for read-only databases or databases you update rarely with only batch updates. It has a huge cost of updating lots of rows whenever you insert a new value to the tree (choose between updating on average half the rows in the table for a single insert or do the inserts and then recalculate the entire table!) that goes against the very idea of a multiuser environment. Firebirds beautiful multiuser abilities are gone when you use this method.

Moreover, with Firebirds WITH RECURSIVE capability, you only need one database query for the entire tree in the adjacency list model, so it should be a lot quicker than doing things the way the article writer assumes. I must admit I have no experience with large trees (and Firebird does have a depth limit of recursion), but for situations where multiple users can add things to anywhere in the tree at any one time, I'd definitely recommend the adjacency list model.

HTH,
Set