firebird-support - Re: [firebird-support] sql question

Subject	Re: [firebird-support] sql question
Author	Svein Erling Tysvaer
Post date	2006-09-05T11:47:52Z

Hi again Maurizio!

Yes, my solution should work regardless of how many rows are identical,
all rows just have to exist for both IDs.

Let's take some examples:

ID F1 F2 F3
1 A B C

2 A B C

3 A B C
3 D E F

4 A B C
4 D E F

5 A B C
5 D E F
5 G H I

6 A B C
6 D E F

My suggested query would (hopefully, still not tested) return

1 2
3 4
3 6
4 6

since these are the identical tuples. All other comparisons would fail,
since there is minimum one row that is missing. E.g. 4 has a row that is
not in 1 (failing the M5/M6 subselect) and 5 has a row that is not in 6
(failing the M3/M4 subselect).

It doesn't matter how many rows match or not match (it could be hundreds
of rows, but must be minimum 1), there just has to be a match somewhere.

Though suppose you had the following situation:

3 A B C
3 A B C
3 D E F

4 A B C
4 D E F

that would match even though the number of rows are different. Hence, my
solution cannot be used if two identical rows are supposed to be
considered different (probably a rare case that has to be solved with a
"subselect count" or redesign of your database, just thought I should
mention it).

HTH,
Set

Maurizio P. wrote:

> thanks for your answer ,
> i could not make a test now ,
>
> ...just a dubt i have :
> is it valid for a different rows number too ?
> in my case the number of the row is variable .
>
> Maurizio
>
>
> ----- Original Message -----
> From: Svein Erling Tysvaer
> To: Firebird-support
> Sent: Tuesday, September 05, 2006 9:03 AM
> Subject: Re: [firebird-support] sql question
>
>
> Hi again, now I'm awake and my suggested solution needs some minor
> corrections:
>
> select distinct M1.ID, M2.ID
> /*Distinct to avoid duplicates*/
> from MaurizioTable M1
> join MaurizioTable M2 on M2.F1 = M1.F1
> and M2.F2 = M1.F2
> and M2.F3 = M1.F3
> and M2.ID+0 > M1.ID
> /*Clause added to avoid lots of cases where M1.ID = M2.ID*/
> /*I also expect one row (10, 16) to be enough*/
> /*and that you do not need (16, 10) as well*/
> /*+0 added due to expected speed improvement*/
> where not exists(select * from MaurizioTable M3
> where M3.ID = M1.ID
> where not exists(select * from MaurizioTable M4
> where M4.ID = M2.ID
> and M4.F1 = M3.F1
> and M4.F2 = M3.F2
> and M4.F3 = M3.F3))
> /*There should be no M1.ID that does not have a duplicate in M2*/
> and not exists(select * from MaurizioTable M5
> where M5.ID = M2.ID
> where not exists(select * from MaurizioTable M6
> where M6.ID = M1.ID
> and M6.F1 = M5.F1
> and M6.F2 = M5.F2
> and M6.F3 = M5.F3))
> /*There should be no M2.ID that does not have a duplicate in M1*/
>
> Though I still haven't tried parsing this and suspect that the size of
> the table as well as index selectivity may or may not make this a viable
> solution (it will be slow on a table with millions of rows)...
>
> Set