Subject | Re: Ideas anyone?
---|---
Author | Adam |
Post date | 2005-09-18T23:47:28Z |
Tom,
Here are the results of four basic tests I ran this morning. If you
have any specific questions about the tests, it is probably best to
contact me by email, since the suggestion was arguably OT.
Test 1
Added all numbers 0 to 10000 and asked the function to generate the
appropriate statement.
561 ms
Test 2
Added 10000 random numbers between 1 and 10000; there were probably
duplicates.
580 ms
Test 3
Added 100 random numbers between 1 and 100; again, there were probably
duplicates.
<10 ms
Test 4
Added 10000 sequential numbers between 1 and 10000, then removed 100
random numbers (this is the most realistic scenario).
560 ms
For reference, the old function just cycled through the dataset, put a
comma between each entry, and returned the string. I repeated test 4
with the old function, and the result was about 300 ms.
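To make the comparison concrete, the old approach boils down to
something like this (a rough Python sketch of the idea only; the real
code is Delphi and works off a TDataset, so the name and signature here
are made up):

def dataset_to_in_clause(field, values):
    # Old approach: join every member with commas into one big IN (...)
    # predicate. Every member is a separate index hit, and it cannot be
    # used at all once there are more than 1499 members.
    return "%s IN (%s)" % (field, ", ".join(str(v) for v in values))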
These tests were carried out on my laptop unplugged and running in
max battery mode. Throw some faster hardware at it and it will go
even faster (it is all memory and CPU work).
Consider that it reduced the number of index hits from about 10000 down
to about 100 for test 4. It is all academic though, because the old
version could not be used here at all: the set has more than 1499 members.
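If anyone wants to play with the range-collapsing idea itself, here is a
rough Python sketch (again illustrative only; the real function is
Delphi, takes a TDataset and is called DatasetToConditionClause, so
treat the names and exact signature here as made up):

def dataset_to_condition_clause(field, values):
    # Collapse a set of integers into BETWEEN ranges joined with OR.
    # Consecutive runs become one predicate each, which is why test 4
    # (10000 sequential numbers with 100 random holes) ends up with
    # roughly 100 predicates instead of ~9900 IN members.
    vals = sorted(set(values))
    if not vals:
        return "1 = 0"  # no members: a clause that matches nothing
    clauses = []
    start = prev = vals[0]
    for v in vals[1:]:
        if v == prev + 1:
            prev = v  # still inside a consecutive run
            continue
        clauses.append(_range_clause(field, start, prev))
        start = prev = v
    clauses.append(_range_clause(field, start, prev))
    return "(" + " OR ".join(clauses) + ")"

def _range_clause(field, lo, hi):
    # Single values get "=", runs get BETWEEN.
    if lo == hi:
        return "%s = %d" % (field, lo)
    return "%s BETWEEN %d AND %d" % (field, lo, hi)

For example, dataset_to_condition_clause("ID", [1, 2, 3, 4, 7, 9, 10])
comes back as (ID BETWEEN 1 AND 4 OR ID = 7 OR ID BETWEEN 9 AND 10).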
Adam
> > > This is how we solved this problem...
> > > We have a function along the lines
> > > DatasetToConditionClause(ds : TDataset; ...
> >
> > Q's
> > - Are you very happy with your solution?
>
> Yes, it is sort of a best of both worlds: the simplicity of the "IN"
> approach but the performance of a better condition clause.
>
> > - level of performance achieved?
>
> We did not do it primarily for performance reasons but rather as a
> workaround for the 1499 limit. One of the slowest links in the chain
> will be network comms, and let me assure you that the solution
> generates a smaller SQL statement for any real input.
>
> If you had 1000 members in the IN predicate, then you are looking at a
> statement that is at least a few KB larger than the equivalent set of
> between statements.
>
> I will do some measurements when I am next in front of my development
> system.
>
> > - approx. amount of data being queried?
> >
>
> It varies from one or two records to over ten thousand records. Again,
> we would never have bothered with such an elaborate solution if it
> wasn't used over 50 times within our application in various capacities.
>
> Adam