Subject Re: [firebird-support] sql question
Author Svein Erling Tysvaer
Hi again Maurizio!

Yes, my solution should work regardless of how many rows are identical,
all rows just have to exist for both IDs.

Let's take some examples:

ID F1 F2 F3
1 A B C

2 A B C

3 A B C
3 D E F

4 A B C
4 D E F

5 A B C
5 D E F
5 G H I

6 A B C
6 D E F

My suggested query would (hopefully, still not tested) return

1 2
3 4
3 6
4 6

since these are the identical tuples. All other comparisons would fail,
since there is minimum one row that is missing. E.g. 4 has a row that is
not in 1 (failing the M5/M6 subselect) and 5 has a row that is not in 6
(failing the M3/M4 subselect).

It doesn't matter how many rows match or not match (it could be hundreds
of rows, but must be minimum 1), there just has to be a match somewhere.

Though suppose you had the following situation:

3 A B C
3 A B C
3 D E F

4 A B C
4 D E F

that would match even though the number of rows are different. Hence, my
solution cannot be used if two identical rows are supposed to be
considered different (probably a rare case that has to be solved with a
"subselect count" or redesign of your database, just thought I should
mention it).

HTH,
Set

Maurizio P. wrote:
> thanks for your answer ,
> i could not make a test now ,
>
> ...just a dubt i have :
> is it valid for a different rows number too ?
> in my case the number of the row is variable .
>
> Maurizio
>
>
> ----- Original Message -----
> From: Svein Erling Tysvaer
> To: Firebird-support
> Sent: Tuesday, September 05, 2006 9:03 AM
> Subject: Re: [firebird-support] sql question
>
>
> Hi again, now I'm awake and my suggested solution needs some minor
> corrections:
>
> select distinct M1.ID, M2.ID
> /*Distinct to avoid duplicates*/
> from MaurizioTable M1
> join MaurizioTable M2 on M2.F1 = M1.F1
> and M2.F2 = M1.F2
> and M2.F3 = M1.F3
> and M2.ID+0 > M1.ID
> /*Clause added to avoid lots of cases where M1.ID = M2.ID*/
> /*I also expect one row (10, 16) to be enough*/
> /*and that you do not need (16, 10) as well*/
> /*+0 added due to expected speed improvement*/
> where not exists(select * from MaurizioTable M3
> where M3.ID = M1.ID
> where not exists(select * from MaurizioTable M4
> where M4.ID = M2.ID
> and M4.F1 = M3.F1
> and M4.F2 = M3.F2
> and M4.F3 = M3.F3))
> /*There should be no M1.ID that does not have a duplicate in M2*/
> and not exists(select * from MaurizioTable M5
> where M5.ID = M2.ID
> where not exists(select * from MaurizioTable M6
> where M6.ID = M1.ID
> and M6.F1 = M5.F1
> and M6.F2 = M5.F2
> and M6.F3 = M5.F3))
> /*There should be no M2.ID that does not have a duplicate in M1*/
>
> Though I still haven't tried parsing this and suspect that the size of
> the table as well as index selectivity may or may not make this a viable
> solution (it will be slow on a table with millions of rows)...
>
> Set