firebird-support - Re: IS OPTIMIZER OPTIMIZING ?

Subject	Re: IS OPTIMIZER OPTIMIZING ?
Author	Lista de Discução Interbase
Post date	2002-05-02T15:30:36Z

Hi guys !

I will join the optimizing discussion and ask you something.

First I will show my table structure:

CREATE DOMAIN FATOR AS NUMERIC(9, 4) NOT NULL;
CREATE DOMAIN INTEIRO AS INTEGER NOT NULL;
CREATE DOMAIN MEMO AS VARCHAR(2000);
CREATE DOMAIN PRECO AS NUMERIC(15, 4) NOT NULL;
CREATE DOMAIN QUANTIDADE AS NUMERIC(15, 4) NOT NULL;
CREATE DOMAIN STRING40 AS VARCHAR(40) NOT NULL;
CREATE DOMAIN STRING80 AS VARCHAR(80) NOT NULL;
CREATE DOMAIN TABLEID AS INTEGER NOT NULL;

Table ServicoComposicao has 17404 records
/* Table: SERVICOCOMPOSICAO, Owner: SYSDBA */

CREATE TABLE SERVICOCOMPOSICAO
(
SERVICOCOMPOSICAOID TABLEID,
SERVICOID TABLEID,
ITEM INTEIRO,
PRODUTOID TABLEID,
QUANTIDADE QUANTIDADE,
FATOR FATOR,
CONSTRAINT PK_SERVICOCOMPOSICAO PRIMARY KEY (SERVICOCOMPOSICAOID)
);
ALTER TABLE SERVICOCOMPOSICAO ADD CONSTRAINT FK_SERVICOCOMPOSICAO_PRODUTO
FOREIGN KEY (PRODUTOID) REFERENCES PRODUTO (PRODUTOID);
ALTER TABLE SERVICOCOMPOSICAO ADD CONSTRAINT FK_SERVICOCOMPOSICAO_SERVICO
FOREIGN KEY (SERVICOID) REFERENCES SERVICO (SERVICOID);

Table LinhaProduto has 186 records

/* Table: LINHAPRODUTO, Owner: SYSDBA */

CREATE TABLE LINHAPRODUTO
(
LINHAPRODUTOID TABLEID,
DESCRICAO STRING80,
CONSTRAINT PK_LINHAPRODUTO PRIMARY KEY (LINHAPRODUTOID)
);

Table Produto has 3139 records

/* Table: PRODUTO, Owner: SYSDBA */

CREATE TABLE PRODUTO
(
PRODUTOID TABLEID,
LINHAPRODUTOID TABLEID,
CODIGO STRING40,
DESCRICAO STRING80,
DESCRICAOCOMPLETA MEMO,
ESTOQUEFISICO QUANTIDADE,
QUANTIDADETRANSITO QUANTIDADE,
ESTOQUEMINIMO QUANTIDADE,
PRECOCUSTO PRECO,
UNIDADEID TABLEID,
CMP_QUANTIDADEDISPONIVEL QUANTIDADE,
CONSTRAINT PK_PRODUTO PRIMARY KEY (PRODUTOID)
);
ALTER TABLE PRODUTO ADD CONSTRAINT FK_PRODUTO_LINHAPRODUTO FOREIGN KEY
(LINHAPRODUTOID) REFERENCES LINHAPRODUTO (LINHAPRODUTOID);

The above tables are small, every one has a PK field (ServicoID, ProdutoID,
LinhaProdutoID) type integer generated by generators, so small key with
high selectivity.

I'd like to understand a little bit on how optimizer works. Lets show some
querys.

select
SC.ServicoID, LP.LinhaProdutoID, LP.Descricao, sum(SC.Quantidade *
P.PrecoCusto)
from
ServicoComposicao SC join
Produto P on (P.ProdutoID = ServicoComposicao.ProdutoID) join
LinhaProduto LP on (LP.LinhaProdutoID = P.LinhaProdutoID)
where
SC.ServicoID = 20310
group by
SC.ServicoID, LP.LinhaProdutoID, LP.Descricao

PLAN SORT (JOIN (SC INDEX (AK_SERVICOCOMPOSICAO_ITEM),P INDEX
(RDB$PRIMARY10),LP INDEX (RDB$PRIMARY8)))

Execution Time: 00:00:00.0010 (1st time)
Execution Time: 00:00:00.0000 (2nd time)
Execution Time: 00:00:00.0000 (3rd time)
Execution Time: 00:00:00.0000 (4rd time)

This query returns 7 rows.

the result is instantaneous. no problem here.

But if I create a view like this:
CREATE VIEW VSERVICOCOMPOSICAOLINHAPRODUTO (
ID,
LINHAPRODUTOID,
LINHAPRODUTODESCRICAO,
CUSTOTOTAL
) AS

select
SC.ServicoID, LP.LinhaProdutoID, LP.Descricao, sum(SC.Quantidade *
P.PrecoCusto)
from
ServicoComposicao SC join
Produto P on (P.ProdutoID = ServicoComposicao.ProdutoID) join
LinhaProduto LP on (LP.LinhaProdutoID = P.LinhaProdutoID)
group by
SC.ServicoID, LP.LinhaProdutoID, LP.Descricao;

and send the following query:

Select * from vServicoComposicaoLinhaProduto where ID = 20310

the plan is:
PLAN SORT (JOIN (VSERVICOCOMPOSICAOLINHAPRODUTO LP
NATURAL,VSERVICOCOMPOSICAOLINHAPRODUTO P INDEX
(RDB$FOREIGN28),VSERVICOCOMPOSICAOLINHAPRODUTO SC INDEX (RDB$FOREIGN23)))

Execution Time: 00:00:00.0631 (1st time)
Execution Time: 00:00:00.0621 (2nd time)
Execution Time: 00:00:00.0641 (3rd time)
Execution Time: 00:00:00.0641 (4rd time)

This query returns 7 rows too (of course).

I know the response time is tiny, but I can see a big diference when run
the query, the first just appears, the second one I see the delay.
Comparing the execution time reported by IBConsole the first one is from 60
times faster to infinitum times faster :-). And I hope these tables will
grow up to 10 times bigger than it was now.

Why optimizer choose to do a natural search on table LinhaProduto, why not
choose the PK Index of table LinhaProduto ?

I think the plan of the two queries should be the same, since I am doing
the same query, just that the second is trough a view, I thought tyhat the
view just encapsulates the query making it more readable, when someone send
a query using a view the "engine" substitute the view by the equivalent
statment, but now I am sure I am wrong with this. As far I can recall, I do
some tests sending to queries like the above in MSSQL and always the plan
is the same.

I noticed another strange (in my opinion) behavior of the engine, I am
sorry, but I need to find it again to show you, the optimizer choose to do
a natural search on one table and I cannot force the plan to use an index
and the engine only execute the query in a natural way.

Below are some statistics from the database:

Database header page information:
Flags 0
Checksum 12345
Generation 84
Page size 4096
ODS version 10.0
Oldest transaction 65
Oldest active 66
Oldest snapshot 66
Next transaction 78
Bumped transaction 1
Sequence number 0
Next attachment ID 0
Implementation ID 16
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 1
Creation date May 2, 2002 11:47:58

LINHAPRODUTO (134)
Primary pointer page: 177, Index root page: 178
Data pages: 4, data page slots: 4, average fill: 63%
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 1
60 - 79% = 3
80 - 99% = 0

Index RDB$PRIMARY8 (0)
Depth: 1, leaf buckets: 1, nodes: 186
Average data length: 1.00, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 0
20 - 39% = 1
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0

Index SK_LINHAPRODUTO_DESCRICAO (1)
Depth: 1, leaf buckets: 1, nodes: 186
Average data length: 13.00, total dup: 47, max dup: 17
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 1

PRODUTO (136)
Primary pointer page: 181, Index root page: 182
Data pages: 149, data page slots: 149, average fill: 86%
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 1
60 - 79% = 0
80 - 99% = 148

Index AK_PRODUTO_CODIGO (1)
Depth: 2, leaf buckets: 6, nodes: 3139
Average data length: 1.00, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 1
80 - 99% = 5

Index RDB$FOREIGN28 (3)
Depth: 2, leaf buckets: 5, nodes: 3139
Average data length: 0.00, total dup: 2965, max dup: 86
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 1
80 - 99% = 4

Index RDB$PRIMARY10 (0)
Depth: 2, leaf buckets: 6, nodes: 3139
Average data length: 1.00, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 1
60 - 79% = 0
80 - 99% = 5

Index SK_PRODUTO_DESCRICAO (2)
Depth: 2, leaf buckets: 18, nodes: 3139
Average data length: 16.00, total dup: 40, max dup: 24
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 1
80 - 99% = 17

SERVICOCOMPOSICAO (130)
Primary pointer page: 168, Index root page: 169
Data pages: 303, data page slots: 303, average fill: 67%
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 303
80 - 99% = 0

Index AK_SERVICOCOMPOSICAO_ITEM (1)
Depth: 2, leaf buckets: 34, nodes: 17404
Average data length: 1.00, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 33

Index RDB$FOREIGN22 (2)
Depth: 2, leaf buckets: 27, nodes: 17404
Average data length: 0.00, total dup: 13626, max dup: 45
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 27

Index RDB$FOREIGN23 (3)
Depth: 2, leaf buckets: 27, nodes: 17404
Average data length: 0.00, total dup: 14607, max dup: 1172
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 1
80 - 99% = 26

Index RDB$PRIMARY4 (0)
Depth: 2, leaf buckets: 31, nodes: 17404
Average data length: 1.00, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 30

The results are the same on Firebird 1 on WinXP and on Firebird
1.0.0-0a.Beta2 on RedHat Linux.

Sorry for my bad English and for the long message, and I hope you could
understand me.

[]s

Alexandre