firebird-support - Re: How firebird use indexes with inner join

Subject	Re: How firebird use indexes with inner join
Author	leroyarnaud29
Post date	2012-08-14T13:08:35Z

> Changed the date/time interval to another 1 hour window with data.
>
> The query executes in ~100ms with an execution plan mentioned above.
> Executing the query in context of a fetch all operation, I get 60
> indexed reads on table A and 30000 indexed reads on B.

i don't understand why it reads 30000 lines in B whereas the query returns 6000 rows, i think only 6000 lines should be read on table "B" but it's normal because it does not use the refwrod index.

Yet i have forced the plan of this queries to PLAN JOIN (A INDEX (A_IDX1), B INDEX (B_IDX1, B_IDX3)) and the result of number of line read is better but
the executed time is worst !

so i think i don't understand how works a inner join internally with index :(

> As the distribution for REFWROD via:
>
> select refwrod, count(*) from b group by refwrod
>
> shows the following:
>
> REFWROD COUNT
> -------------------------------------------------------------
> er 1000000
> op 1000000
> qw 1000000
> ty 1000000
> ui 1000000
>
>
> The above execution statistic ain't bad.
>
> If you are still not happy with the performance and the number of
> indexed reads, a compound index on (REFA, REFWROD) might reduce the
> indexed reads on B even further (haven't tried though).

i test and it works => good numbers of read and i obtain a low execution time
but why it don't use naturally all index on this table (B) like a simple request (select * from b where refa < 1000 and refwrod = 'qw') ?

> Is your example something real from your production scenario?

nearly i have more more rows in each table(1000000 for A and 10000000 for B) and have more different word (about 2000), so it's very long.
because i have to do this kind of proc stock :

For select someword from complex_storedproc where date > :begin and date <
:end goup by refwrod into :word do begin
select sum(nbword) from A INNER JOIN B on id = refa where a.datedoc

>= :begin and a.datedoc <= :end and b.refwrod = :word into :nbwords;

end

so it must read all lines of B between :begin and :end date on each loop.
so in this little exemple if there are 5000 line in A between :begin and :end Date there will be 2500000 read in table B for each word.
and in this case it very slow : you can try with this stored proc

CREATE TABLE WORD (
SOMEWORD VARCHAR(10)
);

INSERT INTO WORD (SOMEWORD)
VALUES ('qw');
INSERT INTO WORD (SOMEWORD)
VALUES ('er');
INSERT INTO WORD (SOMEWORD)
VALUES ('ty');
INSERT INTO WORD (SOMEWORD)
VALUES ('ui');
INSERT INTO WORD (SOMEWORD)
VALUES ('op');

create or alter procedure TEST_A_B (
DEBUT timestamp,
FIN timestamp)
returns (
ID integer)
AS
declare variable foreignk varchar(50);
begin
/* imagine that the result of FOR SELECT is the result of a complex stored proc */
FOR SELECT someword from word into :foreignk do
begin
select count(a.id) from a inner join b on ID = refa
where a.datedoc >= :debut and a.datedoc <= :fin and b.refwrod = :foreignk into :id;
suspend;
end

end

if you take an interval of 4 days we can begin to imagine my problem ;)
in this case your solution with your index on (refa and refword) works too. but it's a heavy solution