Subject Re: Questions about SELECT DISTINCT *
Author Svein Erling Tysvær
--- In firebird-support@yahoogroups.com, Svein Erling Tysvær wrote:
> --- In firebird-support@yahoogroups.com, "robert_difalco" wrote:
> > --- In firebird-support@yahoogroups.com, Svein E. Tysvær wrote:
> > > >
> > > > SELECT * FROM Table A WHERE A.uid IN
> > > > SELECT DISTINCT uid FROM Table [JOINS][WHERE];
> > >
> > > Yup, my first thought was "What a lousy idea, this won't even
> > > get him the right result"! If RowA and RowB have the same UID,
> > > then both will match the subselect and hence be returned. I
> > > cannot even think of a case where the DISTINCT word of 'IN
> > > (SELECT DISTINCT...)' makes any difference.
> >
> > I think you are missing the JOINS. Think of a JOIN say on a Links
> > table that creates a tree structure. Believe it or not, the query
> > above performs more efficiently that without the subquery because
> > of the DISTINCT.
> >
> > R.
>
> Maybe there are things that I don't see, but how does your query
> differ from the same query without DISTINCT in the subselect? In my
> opinion, the only possible difference between
>
> SELECT * FROM TableA WHERE A.uid IN
> (SELECT DISTINCT uid FROM TableA [JOINS][WHERE]);
>
> and
>
> SELECT * FROM TableA WHERE A.uid IN
> (SELECT uid FROM TableA [JOINS][WHERE]);
>
> is that the former may be slower than the latter (depending on how
> the optimizer treats DISTINCT is such cases, it could simply ignore
> the word). The results of these two queries should be identical
> anyway (regardless of any JOIN).
>
> Set

Writing yet another mail just to make my point even clearer:

SELECT A.uid FROM TableA a WHERE A.uid IN
(SELECT DISTINCT b.uid FROM TableA b [JOINS][WHERE]);

is identical to

SELECT A.uid FROM TableA a WHERE A.uid IN
(SELECT b.uid FROM TableA b [JOINS][WHERE]);

Both of these differ (produces a different result set) from

SELECT DISTINCT A.uid FROM TableA a WHERE A.uid IN
(SELECT b.uid FROM TableA b [JOINS][WHERE]);

The first two queries return duplicates in case of two records with
the same value in TableA.uid, the last query only returns one record.

Set