Subject RE: [firebird-support] Speed optimization required for DELETE query
Author Svein Erling Tysvær
I think this double negation should be equal to what you and Helen wrote, just (hopefully) noticeably quicker. Note that records dated 31 March 2007 are deleted from Table_C unless they're in Table_B (i.e. '03/31/2007' are treated differently to all other dates, if they should be treated equally to e.g. '03/30/2007', change '>=' to '>').

Delete from Table_C C
where C.dt_date <= '03/31/2007'
And not exists(
select 1 from Table_A A
where A.var_Empcode = C.var_EmpCode and
(A.var_Status = 'I'
or A.var_Resigned_But_Not_Settled = 'Y')
and not exists(
select 1 from Table_B B
where A.var_EmpCode = B.var_EmpCode
And A.var_PGCode = B.var_PGCode
and B.dt_AppliedDate >= '03/31/2007'));

NOT IN (<subselect>) can be very slow (the subselect is executed one for every potential row in Table_C), and your subselect both contains a GROUP BY that makes it even slower and an redundant DISTINCT. You also want to avoid MAX in a subquery, so I'm not at all surprised your query is slow even though each table only contains a few records.

Indexes on Table_A.var_Empvode, Table_B.var_Empcode and Table_B.var_PGCode are very useful for the query I wrote above if they have decent selectivity, whereas a DESC index on Table_B.dt_AppliedDate is not required unless the combination var_EmpCode and var_PGCode has poor selectivity.

HTH,
Set

-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of Helen Borrie
Sent: 10. september 2009 07:44
To: firebird-support@yahoogroups.com
Subject: Re: [firebird-support] Speed optimization required for DELETE query

At 01:07 AM 10/09/2009, vishualsoft wrote:

>Delete from Table_C C
> where
> Extract(Year from C.dt_Date) <= '2007'
> And
> Extract(Month from C.dt_Date) <= '03'
> And
> C.var_EmpCode
> Not In(
> Select Distinct(A.var_EmpCode)
> from Table_A A Join Table_B B
> On
> A.var_EmpCode = B.var_EmpCode
> And
> A.var_PGCode = B.var_PGCode
> And
> (A.var_Status = 'I'
> or
> A.var_Resigned_But_Not_Settled = 'Y')
>
> Group By A.var_EmpCode
> Having Max(B.dt_AppliedDate) < '03/31/2007'
> );
>
> Even if we modifiy sub query as shown below,
> it takes same amount of time.
>

Simplify the thing!

Delete from Table_C C
where
C.dt_date <= '03/01/2007'
And
not exists (
select 1 from Table_A A
join Table_B B
On
A.var_EmpCode = B.var_EmpCode
And
A.var_PGCode = B.var_PGCode
where
A.var_Empcode = C.var_EmpCode and
And
(A.var_Status = 'I'
or
A.var_Resigned_But_Not_Settled = 'Y') )
and Max(B.dt_AppliedDate) < '03/31/2007' /* are you sure about this? */
);

A DESC index on B.dt_AppliedDate might be used there, too.

./heLen