Subject RE: [firebird-support] Major query performance problems
Author Svein Erling Tysvær
Hi Alec!

>Statistics on COPY."SOURCE_ID" column index
>IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.018519
>Statistics on PHYSICAL_COPY."IS_MARKED_DELETED" column index
>IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.500000
>
>SELECT FIRST 1000 PC.ID AS COL0
>FROM PHYSICAL_COPY PC
>JOIN COPY C ON PC.COPY_ID = C.ID
>WHERE PC.IS_MARKED_DELETED = 1
> AND C.SOURCE_ID = 'fb50a346-04e7-42d3-8782-1432001c74ad'
>
>PLAN JOIN (C INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==), PC INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
>Total execution time: 0:30:22 (hh:mm:ss) Script execution finished.

(I modified to using aliases)

Sounds like a plausible plan to me. The optimizer have no histograms and believes PC.IS_MARKED_DELETED = 1 is equally selective as PC.IS_MARKED_DELETED = 0 (from your execution times, I believe that only a small fraction of the records have PC.IS_MARKED_DELETED = 1) and that having to use the index for PC.COPY_ID rather than the presumable C.ID is worth it considering that the index for c.SOURCE_ID is far more selective than the one for PC.IS_MARKED_DELETED. In retrospect, we know this isn't the case, but I think I would have guessed the same as the optimizer did hadn't I known the execution times.

>NOW THE SAME QUERY WITH ONE FILTER CONDITION:
>WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
>PLAN JOIN (PHYSICAL_COPY INDEX (IDX_sKfIZDMkd26ITRo1Da5mzQ==), COPY
>INDEX (PK_ZM6SRonqR8AHSQuCISgvnQ==))
>Total execution time: 3.297s
>
>NOW THE SAME QUERY WITH THE OTHER FILTER CONDITION:
>WHERE (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
>PLAN JOIN (COPY INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==), PHYSICAL_COPY
>INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
>Total execution time: 23.624s

OK, then we know that the first 'one filter plan' is preferable. This can virtually be enforced (well, we're telling the optimizer not to use the index for the PC.COPY_ID field, it may choose other plans that we do not know about) by changing the FROM clause to either:

FROM PHYSICAL_COPY PC
JOIN COPY C ON PC.COPY_ID||'' = C.ID

(assuming PC.COPY_ID to be defined as a character field, if it is numeric use +0 rather than ||'')

or

FROM PHYSICAL_COPY PC
LEFT JOIN COPY C ON PC.COPY_ID = C.ID

By the way, I'm surprised by the long time taken to prepare the statement (it IS a very simple query). It might be due to me not being used to Fb 2.5, the database being used heavily or maybe it has more indexes/fields/tables than I'm used to, but have you checked your statistics to see there's not a huge transaction gap?

HTH,
Set