Subject Re: alternative to an IN clause...
Author Adam
--- In, "Greg At ACD"
<GregAtACD@h...> wrote:
> Hi all,
> Background:
> I have a Firebird 1.5.2 database that I am accessing using Visual
> C++ (.NET 2003) and using SQLAPI for DB access
> (
> Our application needs to make a call similar to the following:
> SELECT * from myTable
> WHERE ID IN (1,2,3...,n)
> and the number of values that will be in the IN statement will vary
> anywhere from 1 to 1000 or more (theoretically).
> From my client application, one option is to simply generate the
> statement above (in reality the statement is much more complicated
> than this, but this post specifically deals with the IN clause) by
> formatting a string containing a comma separated list of values
> (they are currently in a simple long array of numbers).
> I'm wondering if there is a more efficient (or a 'best practices')
> way of doing this? Perhaps dumping the values in a temporary table
> and doing an inner join to this? I would think that the overhead of
> doing 1000 insert statements before executing the query may not be
> very efficient. Perhaps using a selectable stored procedure that
> takes the comma-separated list of numbers as input, and do a join
> this virtual table?
> Any help here is certainly appreciated... i can't imagine that this
> is a new problem.

There are a lot of performance issues to consider. The insert into
temporary table can make sense in some cases, Especially if you are
re-using this list across multiple queries, but remember you need to
clean up after yourself, and you will want to isolate the list to
your instance of the application, maybe even your transacction only.

You may not be able to insert much over 1000 records per second, and
if the csv gets big enough, this might come into play.

You can have at most 1499 members inside an IN clause. This is a
restriction inside Firebird.
Each member of the IN clause is a separate index lookup, so put 1000
records inside the IN, and you will hit the index 1000 times.

The between logic eg (where ID between 1 and 1000) or alternatively
(where ID>=1 and ID<=1000) uses a single index lookup, so it is more

Our solution to this problem was a function that converts a dataset
into the appropriate clause for the query. It handles duplicates, and
automatically generates an optimal bunch of between statements and
then a bunch of IN statements to handle those singular values.

It is written in Delphi, but I am sure it has conceptual equivalents
in C++.Net