Subject Re: [firebird-support] Duplicate record deletion
Author Ann W. Harrison
Ian Newbury wrote:
>>
>>the subselect in the following statement returns 3008 records out of
>>the total of 3068. However the full statement does nothing.
>>
>>delete from ppm_mappings m2
>> where m2.rdb$db_key not in
>> (select min(m1.rdb$db_key) from ppm_mappings m1
>> group by m1.matrix_id, m1.product_id, m1.process_id)
>>

Interesting. There a may be something a bit odd about the combination
of IN and db_key ... The table definition is

create table t1 (f1 integer);

The table contains six records, all with f1 = 1

This query returns no records where it should return 5:

SQL> select x.f1 from t1 x where x.rdb$db_key not in
CON> (select min(y.rdb$db_key) from t1 y
CON> group by y.f1);

This query returns one record, as it should:

SQL> select x.f1 from t1 x where x.rdb$db_key =
CON> (select min(y.rdb$db_key) from t1 y
CON> group by y.f1);

F1
============

1
>

But you may just be over thinking the problem.

Try this:

delete from xxx x
where x.rdb$db_key <>
select min (y.rdb$db_key) from xxx y
where y.field1 = x.field1 and
y.field2 = x.field2 and
y.field3 = x.field3 and ...


Arno Brinkman wrote:
>
> I don't think using the DB_KEY is a good idea.

Just out of curiosity, why not?

> Can't you solve it another way? Like:
>
> DELETE FROM
> ppm_mappings m2
> WHERE
> NOT SINGULAR(SELECT 1 FROM ppm_mappings m1 WHERE
> m1.matrix_id = m2.matrix_id and
> m1.product_id = m2.product_id and
> m1.process_id = m2.process_id)
>

If I understand the standard correctly, that should delete all records
that have duplicates when the query starts. It won't, given the current
state of Firebird, because we incorrectly execute the subselect for each
iteration of the delete, so the last record with each set of values
fails the "NOT SINGULAR" test.

Regards


Ann