Subject | Re: Facing Problem with limitations of IN CLAUSE |
---|---|
Author | Adam |
Post date | 2006-03-12T01:12:23Z |
> As per my knowledge I know that IN CLAUSE have limitations of only 1499Also FB 1.5 and earlier
> elements. (firebird v2.0)
>
> I have one query where IN CLAUSE elements exceeds above 4000 thereforeAssuming the field is indexed, there would be an indexed read for each
> firebird gives error...
>
> I tried entering the elements in temp. table and then using the
>
> SELECT ........ from TABLENAME where ...... in (Select .. from
> TEMPTABLE)
>
> It works, but quiet slow because I have to first insert the elements in
> TEMP TABLE and
> then fire the main query as shown above.
>
> I Also tried using combination of IN and OR Operator as
>
> select ... from tablename where .. in (1,2,3,4,5,6....,1499) or ... in
> (1500.......) or ... in(.......)
>
> But this works very very slow taking around 8 to 10 minutes and
> sometimes even more, so not very effective query
member of an IN statement. In other words, you would hit the index
4000 times, then check the data pages. By the look of your query, you
are probably selecting most of the records anyway, so a natural scan
would be quicker. (Indexed reads only help when a significant number
of data pages can be skipped).
>Lets split this into two issues, the 1499 issue and the speed issue.
> Some one suggested me to break the query in two or more
> like first 1499 elements in one query and then fire and get the records
> and then
> next 1499 elements and get the records and so .. on
>
> but the problem is that I want to order by clause and with this approach
> it is not possible to sort the data....
> and secondly If query contains aggregate functions like SUM, MAX, MIN
> ..... in that case
> firing query with first 1499 elements and so..on will give me wrong
> results
>
> SO, Is there any way to overcome this limitations with out creating any
> TEMP TABLE
The 1499 issue is an implementation limitation, IIRC it is the amount
of scratch space it has to work with, so it may not be something so
quickly overcome in a future version.
We have a similar requirement to yours. We have a filter that the user
ticks and unticks potentially thousands of records, and in larger
databases this 1499 limit can be reached. In fact, if you start doing
the 'WHERE FIELD IN (1,2,...,1499) OR FIELD IN (.....) trick, the next
limit you will hit is the maximum query size of 64K. (probably around
10K-20K members).
We use Delphi, and had a DatasetToCSV function that took a TDataset
(or descendant), cycled through each record and produced the
appropriate (1,2,3,4, etc) list. I assume you are doing something
similar. The solution was to do something similar.
The revised function sees the input dataset was first sorted. The
first record was then noted, and the sorted dataset is then read
through until the first 'break' (number that is not there). If there
are more than two records between them, then the range is converted to
a between. Otherwise the numbers are placed into a misc holding
clientdataset. This continues until every one of the sorted records
are in either a between range or the misc table. The between ranges
are then OR'ed together in the resulting function, and the misc
dataset is broken down into IN clause(s) with maximum 1499 records.
Example
TableA, ID
1,2,3,4,6,7,8,10,11,12,13,15
The 'old way':
select *
from TableA
where ID in (1,2,3,4,6,7,8,10,11,12,13,15)
A better way:
select *
from TableA
where (
(ID >= 1 and ID <= 4) OR
(ID >= 6 and ID <= 8) OR
(ID >= 10 and ID <= 13) OR
(ID IN (15))
)
OK, it doesn't look better in this particular case, but it hits the
index 8 times fewer than the first query and when large sections of
records are returned you are less likely to encounter problems. Also
the additional carriage returns are just for readability.
If you took your example, it would produce
select *
from tablename
where ((ID >= 1 and ID <= 4499))
A single hit to the index (although in this case I would probably tell
the function to use the field ID+0 so it returns the following because
there is no point using an index to retrieve all records in the table)
select *
from tablename
where ((ID+0 >= 1 and ID+0 <= 4499))
Adam