Subject | RE: [firebird-support] Major query performance problems |
---|---|
Author | Svein Erling Tysvær |
Post date | 2010-12-08T07:52:44Z |
Hi Alec!
Sounds like a plausible plan to me. The optimizer have no histograms and believes PC.IS_MARKED_DELETED = 1 is equally selective as PC.IS_MARKED_DELETED = 0 (from your execution times, I believe that only a small fraction of the records have PC.IS_MARKED_DELETED = 1) and that having to use the index for PC.COPY_ID rather than the presumable C.ID is worth it considering that the index for c.SOURCE_ID is far more selective than the one for PC.IS_MARKED_DELETED. In retrospect, we know this isn't the case, but I think I would have guessed the same as the optimizer did hadn't I known the execution times.
FROM PHYSICAL_COPY PC
JOIN COPY C ON PC.COPY_ID||'' = C.ID
(assuming PC.COPY_ID to be defined as a character field, if it is numeric use +0 rather than ||'')
or
FROM PHYSICAL_COPY PC
LEFT JOIN COPY C ON PC.COPY_ID = C.ID
By the way, I'm surprised by the long time taken to prepare the statement (it IS a very simple query). It might be due to me not being used to Fb 2.5, the database being used heavily or maybe it has more indexes/fields/tables than I'm used to, but have you checked your statistics to see there's not a huge transaction gap?
HTH,
Set
>Statistics on COPY."SOURCE_ID" column index(I modified to using aliases)
>IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.018519
>Statistics on PHYSICAL_COPY."IS_MARKED_DELETED" column index
>IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.500000
>
>SELECT FIRST 1000 PC.ID AS COL0
>FROM PHYSICAL_COPY PC
>JOIN COPY C ON PC.COPY_ID = C.ID
>WHERE PC.IS_MARKED_DELETED = 1
> AND C.SOURCE_ID = 'fb50a346-04e7-42d3-8782-1432001c74ad'
>
>PLAN JOIN (C INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==), PC INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
>Total execution time: 0:30:22 (hh:mm:ss) Script execution finished.
Sounds like a plausible plan to me. The optimizer have no histograms and believes PC.IS_MARKED_DELETED = 1 is equally selective as PC.IS_MARKED_DELETED = 0 (from your execution times, I believe that only a small fraction of the records have PC.IS_MARKED_DELETED = 1) and that having to use the index for PC.COPY_ID rather than the presumable C.ID is worth it considering that the index for c.SOURCE_ID is far more selective than the one for PC.IS_MARKED_DELETED. In retrospect, we know this isn't the case, but I think I would have guessed the same as the optimizer did hadn't I known the execution times.
>NOW THE SAME QUERY WITH ONE FILTER CONDITION:OK, then we know that the first 'one filter plan' is preferable. This can virtually be enforced (well, we're telling the optimizer not to use the index for the PC.COPY_ID field, it may choose other plans that we do not know about) by changing the FROM clause to either:
>WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
>PLAN JOIN (PHYSICAL_COPY INDEX (IDX_sKfIZDMkd26ITRo1Da5mzQ==), COPY
>INDEX (PK_ZM6SRonqR8AHSQuCISgvnQ==))
>Total execution time: 3.297s
>
>NOW THE SAME QUERY WITH THE OTHER FILTER CONDITION:
>WHERE (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
>PLAN JOIN (COPY INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==), PHYSICAL_COPY
>INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
>Total execution time: 23.624s
FROM PHYSICAL_COPY PC
JOIN COPY C ON PC.COPY_ID||'' = C.ID
(assuming PC.COPY_ID to be defined as a character field, if it is numeric use +0 rather than ||'')
or
FROM PHYSICAL_COPY PC
LEFT JOIN COPY C ON PC.COPY_ID = C.ID
By the way, I'm surprised by the long time taken to prepare the statement (it IS a very simple query). It might be due to me not being used to Fb 2.5, the database being used heavily or maybe it has more indexes/fields/tables than I'm used to, but have you checked your statistics to see there's not a huge transaction gap?
HTH,
Set