Subject Re: [firebird-support] Re: how do i speed this up
Author Svein Erling Tysvaer
Hi again!

Your procedure is similar to what I suggested, but probably a bit
slower. At first glance, I do not see how it could return NULL, but I
haven't looked closely since this may be gone when you clean up your
procedure and if not, then it could possibly be easily circumvented by
adding WHERE IDOUT IS NOT NULL to your calling query.

You seem to prefix variables with a colon everywhere. I think that
should only be done in the sql calls. Moreover, SET TERM is part of the
stored procedure language, so I would try to avoid having TERM as the
name of a parameter. You combine SELECT FIRST with counting manually,
and that is just wasting time and resources. And finally, is ASTERM at
all indexed (if it had been ISO8859_1, I don't think a VARCHAR(240)
could be indexed)? If not, add an indexed shadow column populated
through triggers that stores the first 80 characters (or thereabout) and
change to something like:

CREATE PROCEDURE GET_NEXT_20_DICENTRIES_BY_ASTER (
AlmostTerm varchar(240),
idin bigint)
returns (
pos integer,
idout bigint)
as
declare variable currentasterm varchar(240);
declare variable priorasterm varchar(240);
declare variable shortasterm varchar(80);
declare variable currentid bigint;
declare variable cnt integer = 1;
begin
pos = 0;
priorasterm = AlmostTerm;
currentid = idin;
while (1=1) do
begin
shortasterm = substring(priorasterm from 1 for 80);
select min(indexedasterm) from dicentries
where ((asterm = :priorasterm and id >= :currentid)
or (asterm > :priorasterm))
and (indexedasterm >= :shortpriorasterm) into :currentasterm;
if (currentasterm is null) then
exit;
if (currentasterm > priorasterm) then
currentid = 0;
pos = cnt;
shortasterm = substring(currentasterm from 1 for 80);
for select id from dicentries
where asterm = :currentasterm and id >= :currentid
and indexedasterm = :currentasterm
order by id ascending into :idout do
begin
suspend;
if (cnt = 19) then
exit;
end
priorasterm = currentasterm;
currentid = idout + 1;
cnt = cnt + 1;
end
end

HTH,
Set

martinknappe wrote:
> Hi Sven,
> yours seems to me like a very good idea; sometimes, it seems, I'm not
> seeing the wood for trees ;)
> Now, I've modified your procedure so that it compiles and it
> *actually* works and looks semantically correct to me, but as it
> happens, in some cases, I get a result set with some of the result
> entries having idout = null (some, not all; and not even at the end of
> the result set, so it seems it's not to do with the exit
> statement)..this seems to be a very subtle bug..does any one see whats
> the problem; i've been looking over it over and over again for the
> last couple hours and don't see what's the problem:
>
> CREATE PROCEDURE GET_NEXT_20_DICENTRIES_BY_ASTER (
> "TERM" varchar(240),
> idin bigint)
> returns (
> pos integer,
> idout bigint)
> as
> declare variable currentasterm varchar(240);
> declare variable priorasterm varchar(240);
> declare variable currentid bigint;
> declare variable cnt integer = 1;
> begin
> pos = 0;
> priorasterm = :term;
> currentid = :idin;
> while (1=1) do
> begin
> select min(asterm) from dicentries where (asterm = :priorasterm
> and id >= :currentid) or (asterm > :priorasterm) into :currentasterm;
> if (:currentasterm is null) then
> exit;
> if (:currentasterm > :priorasterm) then
> currentid = 0;
> pos = :cnt;
> select first 1 id from dicentries where (asterm = :currentasterm
> and id >= :currentid) order by id ascending into :idout;
> suspend;
> if (:cnt = 19) then
> exit;
> priorasterm = :currentasterm;
> currentid = :idout + 1;
> cnt = :cnt + 1;
> end
> end
>
> thanx a lot,
>
> martin
>
> --- In firebird-support@yahoogroups.com, Svein Erling Tysvaer
> <svein.erling.tysvaer@...> wrote:
>> OK then, Martin.
>>
>> Try something like
>>
>> CREATE PROCEDURE get_next_20_dicentries_by_asterm(
>> astermparam varchar(80); idparam integer) returning idreturn integer
>> as
>> declare variable currentasterm varchar(80);
>> declare variable currentid integer;
>> declare variable count integer;
>> begin
>> count = 0;
>> select min(asterm) from dicentries
>> where asterm > :astermparam
>> or (asterm = :astermparam
>> and id > :idparam) into :currentasterm;
>> if (currentasterm > astermparam) then
>> currentid = 0 //This ascertains that id > 0
>> else
>> currentid = idparam;
>> while (count < 20) do
>> begin
>> for select id from dicentries
>> where (asterm = :currentasterm
>> and id > :currentid)
>> order by id
>> into :idreturn do
>> begin
>> count = count+1;
>> if (count < 20) then
>> suspend
>> else
>> exit;
>> end
>> select min(asterm) from dicentries
>> where asterm > :currentasterm
>> into :currentasterm;
>> currentid = 0;
>> end
>> end
>>
>> Now, this stored procedure is written in Thunderbird and will not work
>> if you try to prepare it (I've written a very limited number of stored
>> procedures in my life). So your first task will be to remove the syntax
>> errors.
>>
>> Once that is done, with two indexes - one on asterm and hopefully a
>> primary key on id, my guess is that it could well help you out of your
>> performance problems. That is, unless you have lots of duplicate asterm
>> values (like 80 percent share five values).
>>
>> Now, I never understood what you meant by GetCollateKey, but I hope my
>> suggestion still helps a bit?
>>
>> Tell us the results, I'm curious. If it still is a problem, please
>> prepare all statements individually and report back the plan (and tell
>> us how selective asterm is).
>>
>> HTH,
>> Set
>>
>> martinknappe wrote:
>>> I have an idea how I could make my select procedure more selective but
>>> I don't know whether it's technically possible with firebird; if any
>>> of you knows, please tell me. The idea is to rewrite this procedure
>>> but also this would make it necessary to obtain the collation key of a
>>> given varchar object inside the procedure (ps
>> eudo-code):
>>> variables: i = 0 (integer), key (varchar(80)), set1 (table), set2
> (table)
>>> 1) the procedure would accept asterm, id as parameters
>>> 2) it would then get the collation key of asterm via a function call
>>> like for example (you please tell me whether any such thing exists):
>>> key = GetCollateKey(asterm); key being a one-byte-character string
>>> 3) it would then "select id from dicentries where
>>> GetCollateKey(asterm) = key" into a result set (set1)
>>> 4) next, it would "select id from dicentries d inner join set1 s on
>>> s.id = d.id" into result set (set2)
>>> 5) then it would do something like this:
>>> "for select id from set2 order by id do
>>> begin
>>> suspend;
>>> i = i + 1;
>>> if i = 20 then goto 8;
>>> end
>>> 6) key = key + 1
>>> 7) goto 3
>>> 8) end of procedure