Subject RE: [firebird-support] Speed optimization required for DELETE query
Author Svein Erling Tysvær
I cannot answer for Helen's query (I'm not used to max without group by and was surprised that it didn't produce a syntax error), but here's a walkthrough of mine for EmpCode002 and EmpCode003:

Delete from Table_C C
where C.dt_date <= '03/31/2007'

/*Is the date early enough? True or both EmpCode002 and EmpCode003 */

And not exists(
select 1 from Table_A A
where A.var_Empcode = C.var_EmpCode and
(A.var_Status = 'I'
or A.var_Resigned_But_Not_Settled = 'Y')

/*Do they exist with correct values in Table_A? True for both EmpCode002 and EmpCode003 */

and not exists( select 1 from Table_B B where A.var_EmpCode = B.var_EmpCode And A.var_PGCode = B.var_PGCode and B.dt_AppliedDate >= '03/31/2007' ));

/*Do they have a recent AppliedDate? True for EmpCode003, false for EmpCode002*/

So,

EmpCode002: EmpCode003:
True and True and
not (True and not (True and
not (False)) not (True))
=
True and True and
not (True and not (True and
True) False)
=
True and True and
not True not False
=
True False
=
Deleted Not deleted

Set

-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of Vishal Tiwari
Sent: 10. september 2009 13:40
To: firebird-support@yahoogroups.com
Subject: RE: [firebird-support] Speed optimization required for DELETE query

Hi Helen,

I think the query is not working as per the requirement.

Let me explain the total background.

We have total three table for this DELETE query.

And all the three tableS contain same empcode with more or less number of records.

Say for example we have following employee codes:

"EmpCode001","EmpCode002","EmpCode003","EmpCode004","EmpCode005",
"EmpCode006","EmpCode007","EmpCode008","EmpCode009","EmpCode001".

Now Table_A will contain only 10 records i.e. one record per employee.

Now Table_B (which maintains the history of each employee) contains
total 100 records(say for example) in this table.

Table_C contains total 1000 records(say for example).

One thing is common that all these 10 eployees data is present in all the three tables.

Now, say for example our sub query gives an output as employee codes as follows:

"EmpCode003","EmpCode004","EmpCode005"

Now, our intension is, not to delete these three employees records form the Table_C for the specified period,

i.e. even though these employees records are present for the period i.e. dt_Date <= '03/31/2007', we don't want to delete.

(Exxcept these three employee all employees data shhould be deleted.

After deleetion if we see the records then these three employees data
should be present in the Table_C for the period dt_Date <= '31/03/2009')

Here when we execute the query which you gave, deletes no records.

we were using IN predicate which compares record by record with sub query.

But IN predicate got some limitations with the values, means it must be less than or equal to 1,500 values.

Now what should be the best way to deal with this situation.

Thanks in Advance.

Vishal Tiwari...



--- On Thu, 10/9/09, Vishal Tiwari <vishualsoft@...> wrote:


From: Vishal Tiwari <vishualsoft@...>
Subject: RE: [firebird-support] Speed optimization required for DELETE query
To: firebird-support@yahoogroups.com
Date: Thursday, 10 September, 2009, 2:15 PM






Hi,

Thanks for your replay and suggestion.

But i didn't even get the following sentences.

the date differently treated, in the braces you had mentioned.

Regards

Vishal

--- On Thu, 10/9/09, Svein Erling Tysvær <svein.erling. tysvaer@kreftreg isteret.no> wrote:

From: Svein Erling Tysvær <svein.erling. tysvaer@kreftreg isteret.no>
Subject: RE: [firebird-support] Speed optimization required for DELETE query
To: "firebird-support@ yahoogroups. com" <firebird-support@ yahoogroups. com>
Date: Thursday, 10 September, 2009, 1:48 PM



I think this double negation should be equal to what you and Helen wrote, just (hopefully) noticeably quicker. Note that records dated 31 March 2007 are deleted from Table_C unless they're in Table_B (i.e. '03/31/2007' are treated differently to all other dates, if they should be treated equally to e.g. '03/30/2007' , change '>=' to '>').

Delete from Table_C C
where C.dt_date <= '03/31/2007'
And not exists(
select 1 from Table_A A
where A.var_Empcode = C.var_EmpCode and
(A.var_Status = 'I'
or A.var_Resigned_ But_Not_Settled = 'Y')
and not exists(
select 1 from Table_B B
where A.var_EmpCode = B.var_EmpCode
And A.var_PGCode = B.var_PGCode
and B.dt_AppliedDate >= '03/31/2007' ));

NOT IN (<subselect> ) can be very slow (the subselect is executed one for every potential row in Table_C), and your subselect both contains a GROUP BY that makes it even slower and an redundant DISTINCT. You also want to avoid MAX in a subquery, so I'm not at all surprised your query is slow even though each table only contains a few records.

Indexes on Table_A.var_ Empvode, Table_B.var_ Empcode and Table_B.var_ PGCode are very useful for the query I wrote above if they have decent selectivity, whereas a DESC index on Table_B.dt_AppliedD ate is not required unless the combination var_EmpCode and var_PGCode has poor selectivity.

HTH,
Set

-----Original Message-----
From: firebird-support@ yahoogroups. com [mailto:firebird- support@ yahoogroups. com] On Behalf Of Helen Borrie
Sent: 10. september 2009 07:44
To: firebird-support@ yahoogroups. com
Subject: Re: [firebird-support] Speed optimization required for DELETE query

At 01:07 AM 10/09/2009, vishualsoft wrote:

>Delete from Table_C C
> where
> Extract(Year from C.dt_Date) <= '2007'
> And
> Extract(Month from C.dt_Date) <= '03'
> And
> C.var_EmpCode
> Not In(
> Select Distinct(A.var_ EmpCode)
> from Table_A A Join Table_B B
> On
> A.var_EmpCode = B.var_EmpCode
> And
> A.var_PGCode = B.var_PGCode
> And
> (A.var_Status = 'I'
> or
> A.var_Resigned_ But_Not_Settled = 'Y')
>
> Group By A.var_EmpCode
> Having Max(B.dt_AppliedDat e) < '03/31/2007'
> );
>
> Even if we modifiy sub query as shown below,
> it takes same amount of time.
>

Simplify the thing!

Delete from Table_C C
where
C.dt_date <= '03/01/2007'
And
not exists (
select 1 from Table_A A
join Table_B B
On
A.var_EmpCode = B.var_EmpCode
And
A.var_PGCode = B.var_PGCode
where
A.var_Empcode = C.var_EmpCode and
And
(A.var_Status = 'I'
or
A.var_Resigned_ But_Not_Settled = 'Y') )
and Max(B.dt_AppliedDat e) < '03/31/2007' /* are you sure about this? */
);

A DESC index on B.dt_AppliedDate might be used there, too.

./heLen