Subject Re: Indexes on subquery
Author hosodaamiya
Hi All

I want to hear that the "IDX_TEST1" is not used.

IDX_TEST1 seems to be not used for Query3 and Query4,
though both of IDX_TEST1 and Index in SUBQUERYTABLE (RDB$PRIMARY14) are used in MS SQL Server.
Doesn't work IDX_TEST1 in queries like Query3 and Query4 in Firebird?

[Table Definition]
CREATE TABLE TEST
(
ID Bigint NOT NULL,
ID2 Bigint NOT NULL,
PRIMARY KEY (ID)
);
CREATE INDEX IDX_TEST1 ON TEST (ID2);
CREATE TABLE SUBQUERYTABLE
(
ID2 Bigint NOT NULL,
PRIMARY KEY (ID2)
);

[Query3]
select id from TEST T where exists(select * from SUBQUERYTABLE sq where sq.id2 = t.id2)
[Query plan]
Preparing query: select id from TEST T where exists(select * from SUBQUERYTABLE sq where sq.id2 = t.id2)
Prepare time: 0.000s
Field #01: TEST.ID Alias:ID Type:BIGINT
PLAN (SQ INDEX (RDB$PRIMARY14))
PLAN (T NATURAL)
Script execution finished.

[Query4]
select t.ID,t.ID2 from TEST T
where t.ID2 in (select id2 from SUBQUERYTABLE sb where t.id2 = sb.ID2)
[Query plan]
Preparing query: select t.ID,t.ID2 from TEST T
where t.ID2 in (select id2 from SUBQUERYTABLE sb where t.id2 = sb.ID2)
Prepare time: 0.011s
Field #01: TEST.ID Alias:ID Type:BIGINT
Field #02: TEST.ID2 Alias:ID2 Type:BIGINT
PLAN (SB INDEX (RDB$PRIMARY14))
PLAN (T NATURAL)
Script execution finished.


--- In firebird-support@yahoogroups.com, Svein Erling Tysvær <svein.erling.tysvaer@...> wrote:
>
> >Hey Set!
> >
> >>> [Query1]
> >>> select id from TEST where id2 in (select id2 from subquerytable);
> >>>
> >>> [Query2]
> >>> select id from TEST where id2 in (1,2);
> >
> >> The subselect in query1 cannot benefit from any index since it doesn't have a
> >> WHERE clause. More importantly Query1 is not a good query to use in
> >> Firebird. The subselect is logically executed once for every record in TEST.
> >> This is done because similar queries like
> >>
> >> select id from TEST where id2 in (select id2 from subquerytable where
> >> TEST.Name = subquerytable.Name)
> >>
> >> would have to execute the subquery for every row in TEST.
> >>
> >> Rewrite your query to
> >>
> >> select id from TEST T where exists(select * from subquerytable sq where
> >> sq.id2 = t.id2)
> >>
> >> This query would use an index for subquerytable.
> >
> >It is extraordinary for me to question the details in your posts, they are usually the 'gold standard'.
> >
> >But in this case I think you are off the mark.
> >
> >Without a filter on the TEST table which uses an indexed condition (or an ORDER BY using an index), the optimizer would always
> >perform a NATURAL scan of the TEST table, since it must test each row for the presence of a match in the sub-query.
> >
> >There is nothing in the query criteria which would cause the optimizer to use an index for the TEST table.
>
> Hi Sean!
>
> 'Gold standard' was long ago, the last couple of years I've more or less repeated my old answers to new questions, unfortunately blurred by my failing memory.
>
> I agree with everything you say about TEST, I just don't think that what I wrote would indicate that an index on TEST could be used.
>
> I simply tried to say that something like
> PLAN JOIN (TEST NATURAL, SubQueryTable INDEX(ID2_Index))
> is a lot better than
> PLAN JOIN (TEST NATURAL, SubQueryTable NATURAL)
> which I would expect for [Query1] (NATURAL on the first tuple in a PLAN is normally a lot better than NATURAL on a subsequent tuple).
>
> Though if JOIN is a valid option (i.e. no multiple matches in SubQueryTable), that would give the optimizer the additional option of
> PLAN JOIN (SubQueryTable NATURAL, TEST INDEX(ID2_Index))
> and be easier to read and at least as quick as using EXISTS (well, unless the optimizer makes a mistake).
>
> Set
>