Subject Re: [ib-support] Are these statements equivalent
Author Svein Erling Tysvaer
Hi Sean,

I disagree with the others having answered you - I doubt these two
statements are equivalent!

First, I just considered the simple case of hundred.col_key not being
unique, e.g.

updates:
col_key col_decim
1 10000000000
2 1000000000

hundred:
col_key
1
1
2

Your original query would in this case return (leaving out 9 zeroes)
(10+1)/2 = 5.5, whereas your new query (I haven't tried, just assume this
is how it is done) would return (10+10+1)/3 = 7.

But there is more than this to your simplification. Your original query
uses two separate instances of the updates table, meaning that if the
updates table have another row with the same col_key as another entry, but
with a value of col_decim below 980000000, then your original query would
include this in its average, whereas your JOINed query would simply ignore
it! So, unless col_key is unique for both the updates and the hundred
table, your two queries are nowhere near equivalent!

In general, using EXISTS more resembles the IN (subselect) - i.e.

select avg(u1.col_decim)
from updates u1
where
exists(select 1 from hundred
join updates u2 on u2.col_key = hundred.col_key
where u2.col_key = u1.col_key
and u2.col_decim > 980000000)

should be equivalent to your original query (well, if you ignore the fact
that this was written in a hurry and hasn't been verified by anyone).

HTH,
Set

- I support Firebird, I am a FirebirdSQL Foundation member.
- Join today at http://www.firebirdsql.org/ff/foundation

At 16:25 09.02.2003 -0500, you wrote:
>All,
>
>I am reviewing a program which was originally created 10+ years ago.
>Some SQL statements which are executed are performing extremely poorly
>and I am trying to optimize them.
>
>Accordingly, I'd like some "2nd opinions" about the way I am restating
>the statements.
>
>The original statement reads:
>
> select avg(updates.col_decim)
> from updates
> where
> updates.col_key in
> (select updates.col_key
> from updates, hundred
> where hundred.col_key = updates.col_key
> and updates.col_decim > 980000000)
>
>My new statement reads:
>
> select avg(updates.col_decim)
> from updates
> join hundred ON hundred.col_key = updates.col_key
> where updates.col_decim > 980000000
>
>
>I can tell you that they both provide the same answer, for the database
>I'm working with.
>
>Does anyone think that the two statements ARE NOT equivalent?