firebird-support - Re: [firebird-support] First word after in alphabetical order

Subject	Re: [firebird-support] First word after in alphabetical order
Author	Helen Borrie
Post date	2005-08-28T02:40:41Z

At 08:51 PM 26/08/2005 +0000, you wrote:

>Hi,
>
>I have a table with a lot of names in a colomn. Giving the beginning
>of a word, I want to be able to go to the corresponding area in the
>table and to have an overview of the area. More precisely, I'm
>looking for FISHE. I want to retrieve the first record in
>alphabetical after FISHE, inclunding FISHE itself if it exists. I
>also want the ten next and ten previous records in alphabetical
>order. Therefore, I tried :
>- for the first and ten next :
>SELECT FIRST 11
>ID_NAME
>FROM NAMES
>WHERE NAME>='FISHE'
>ORDER BY NAME;
>
>-for the ten previous :
>SELECT FIRST 10
>ID_NAME
>FROM NAMES
>WHERE NAME<'FISHE'
>ORDER BY NAME DESC;
>
>But this is very slow. 1m45s each on my server (FB1.5.2 /Linux).
>
>In opposite :
>SELECT FIRST 11
>ID_NAME
>FROM NAMES
>WHERE NAME LIKE 'FISHE%'
>ORDER BY NAME;
>
>only need 1s (but don't give the right result if less than 11 names
>are starting with FISHE).
>
>Any idea to solve my problem (the global one, not just improving my
>requests)?

This is a heavy requirement - SELECT FIRST sets are unavoidably
expensive. As you observed, your second approach doesn't get you what you
want.

If you don't have *both* ascending and descending indexes on the NAME
column, you should. If you do, and it still isn't fast enough, then I
would recommend using a SELECT stored procedure to get this set, to reduce
the amount of table-walking that has to be done. NAME should still be
indexed both ways.

As I understand it, you want N + 1 + N rows: the first row where the NAME
starts with 'ASTRING' and the N rows above and below that row in the
ordered set. I assume ID_NAME is the primary key (or some other unique
constraint) of type BigInt and that NAME is stored as a varchar(40). I'm
also assuming that you want a neighbourhood set for the "nearest", even if
the target name doesn't exist, which will be slower than abandoning the
select under this condition.

For flexibility you can write your SP so that the neighbourhood boundaries
can be set dynamically in the procedure call.

Here's how the SELECT procedure would be:

create procedure GetMySet(InputStr varchar(40), N smallint)
returning values (ID_NAME BigInt)
as
declare variable TARGET_ID BigInt = 0;
declare variable COUNTER Integer = 0;
declare variable OUTPUT_BEGUN SmallInt = 0;
begin
/* get the PK value of the first occurrence of neighbourhood member or
nearest preceding */

for select ID_NAME from NAMES
where NAME <= :InputStr
order by NAME desc
into :TARGET_ID do
begin
if (counter = N + 1) then
break;
counter = counter + 1;
end
counter = 0;
-- ----------------------------------------------------------------
-- A - This block may be unnecessarily costly
if (TARGET_ID = 0) then -- no hits, so just get the ID of the first row
begin
for select NAME_ID from NAMES
order by NAME
into :TARGET_ID
do
begin
counter = counter + 1;
if (counter > 0) then
break;
end
end
/* --------------------------------------------------------------
B -This is cheaper
if (TARGET_ID = 0) then Exit;
----------------------------------------------------------------- */
counter = 0;
for select ID_NAME from NAMES
order by NAME asc
into :ID_NAME do
begin
if (counter = 0 and ID_NAME = TARGET_ID) then
OUTPUT_BEGUN = 1;
if (OUTPUT_BEGUN = 1) then
begin
counter = counter + 1;
if (counter <= 1 + (N * 2)) then
suspend; -- output a record
else
Break;
end
end
end

./hb