Subject | Re: [firebird-support] (unknown) |
---|---|
Author | Alexandre Benson Smith |
Post date | 2012-07-11T19:48:36Z |
Em 11/7/2012 14:23, K Z escreveu:
the size of the tables would impact on the time to get the records. I
think that if you just filter using CLIENTE like '%crist%' is faster
because it starts by this table, but when you put the fields from the
other tables the optimizer thinks that will be faster to start from
table PAIS for example, and thus the final result is slow. There is no
way to make this kind of query to be fast, it simply cannot use any
index to speed it up and worse sicne you are searching for the same
information on fields from a bunch of tables the optimizer will choose
very bad plans because it will start from the table with fewer rows even
when the data is in fact on the larger table. CONTAINING will no help as
you already experienced and the stored procedure will not help either...
You could "fool" the optimizer forcing the table CLIENTES to be the
first one scanned using a query like:
CREATE VIEW LISTA_CLIENTES (COND_PAGAMENTO_DESCRICAO, PAIS_NOME, IDCLIENTES, CLIENTE, MORADA, CP, LOCALIDADE, PAIS, TELEFONE, FAX, EMAIL, CONDICOES_PAGAMENTO, REGIME_IVA, PER_DESCONTO, NUNCA_BLOQUEAR, CONTRIBUINTE, COD_CLIENTE_FAT, ATIVO, VALIDADO)
AS
select cp.DESCRICAO COND_PAGAMENTO_DESCRICAO,p.NOME PAIS_NOME,c.*
FROM CLIENTES c
left join PAISES p on p.CODIGO_ISO_3166_1=c.PAIS
inner join COND_PAGAMENTO cp on
cp.IDCOND_PAGAMENTO=c.CONDICOES_PAGAMENTO
order by c.CLIENTE;
*But* it will be slower if the real searched data resides on table PAIS...
I use a diferent approach to help the user to search for records, first
the user point wich field it wants to search (for example the Name) then
I do a search in 3 steps:
1.) search for the exact name:
Select * from Lista_Clientes where Cliente = 'something'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record I will go to step 2
2.) search for the beginning
Select * from Lista_Clientes where Cliente like 'something%'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record I will go to step 3
3.) sarch for the string in anywhere in the specified field
Select * from Lista_Clientes where Cliente like '%something%'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record the user can try another search
The step 1 and 2 are really fast no matter what the size of the table
because it will use an index on the field, step 3 will be slow since no
index could be used, but in the vast majority of the cases step 1 and 2
will return the desired records, step 3 will not be executed very often.
As an aditional feature user could put wildcards on the searched string,
if the search string contains any wildcard the step 1 is bypassed and if
the first character of the string is an wildcard the routine would
bypass step 1 and 2 and go directly for step 3.
I know that what you are trying to achieve is friendly to the user, he
just throw some data and the system tries to find it anywhere, but it
will not be fast. IMHO you should think about the general case, how
often the user want to find a costumer by a string but don't know if
that string is a name or a country or a phone number ??? how much bogus
information will it return ? (ex. costumers who has that piece of the
string on the street address and the user are searching for the name of
the company), I think it will return a lot of uninteresting information.
see you !
>This kind of search criteria (like '%crist%') will not use an index, so
> Hello,
>
> I'm still far from a expert on firebird and sql. If someone can help me to improve this code, i will be grateful.
>
> 1) GOAL: I need to make a search for a partial word in several fields of the customers table.
>
> 2) STRUCTURE: I am using a view because i join the customers table with the country names table and payment conditions table.
>
>
> CREATE TABLE CLIENTES
> (
> IDCLIENTES INTEGER NOT NULL,
> CLIENTE VARCHAR(100) NOT NULL COLLATE UNICODE_CI_AI,
> MORADA VARCHAR(100) NOT NULL COLLATE UNICODE_CI_AI,
> CP VARCHAR(10) DEFAULT NULL COLLATE UNICODE_CI_AI,
> LOCALIDADE VARCHAR(50) NOT NULL COLLATE UNICODE_CI_AI,
> PAIS VARCHAR(2) DEFAULT 'PT' NOT NULL COLLATE UNICODE_CI_AI,
> TELEFONE VARCHAR(15) DEFAULT NULL COLLATE UNICODE_CI_AI,
> FAX VARCHAR(15) DEFAULT NULL COLLATE
> UNICODE_CI_AI,
> EMAIL VARCHAR(100) DEFAULT NULL COLLATE UNICODE_CI_AI,
> CONDICOES_PAGAMENTO SMALLINT DEFAULT 1 NOT NULL,
> REGIME_IVA VARCHAR(1) DEFAULT 'G' NOT NULL COLLATE UNICODE_CI_AI,
> PER_DESCONTO DECIMAL(6,2) DEFAULT 0 NOT NULL,
> NUNCA_BLOQUEAR VARCHAR(1) DEFAULT 'F' NOT NULL COLLATE UNICODE_CI_AI,
> CONTRIBUINTE VARCHAR(25) DEFAULT NULL COLLATE UNICODE_CI_AI,
> COD_CLIENTE_FAT VARCHAR(15) DEFAULT NULL COLLATE UNICODE_CI_AI,
> ATIVO VARCHAR(1) DEFAULT 'T' NOT NULL COLLATE UNICODE_CI_AI,
> VALIDADO VARCHAR(1) DEFAULT 'F' NOT NULL COLLATE UNICODE_CI_AI,
> CONSTRAINT INTEG_26 PRIMARY KEY (IDCLIENTES)
> );
>
> CREATE INDEX IDX_CLIENTES1 ON CLIENTES (CLIENTE);
> CREATE INDEX IDX_CLIENTES2 ON CLIENTES (PAIS);
> CREATE INDEX IDX_CLIENTES3 ON CLIENTES (CONDICOES_PAGAMENTO);
>
> CREATE TABLE COND_PAGAMENTO
> (
> IDCOND_PAGAMENTO INTEGER NOT NULL,
> DESCRICAO VARCHAR(45) NOT NULL COLLATE UNICODE_CI_AI,
> QTD_DIAS
> SMALLINT DEFAULT 0,
> ENTREGA_PAGAMENTO VARCHAR(1) DEFAULT 'T' NOT NULL COLLATE UNICODE_CI_AI,
> CONSTRAINT INTEG_14 PRIMARY KEY (IDCOND_PAGAMENTO)
> );
>
> CREATE TABLE PAISES
> (
> IDPAISES INTEGER NOT NULL,
> NOME VARCHAR(60) NOT NULL COLLATE UNICODE_CI_AI,
> CODIGO_ISO_3166_1 VARCHAR(2) NOT NULL COLLATE UNICODE_CI_AI,
> VALOR_PORTES DECIMAL(10,2) DEFAULT NULL,
> ISENCAO_PORTES DECIMAL(10,2) DEFAULT NULL,
> CONSTRAINT INTEG_79 PRIMARY KEY (IDPAISES)
> );
>
> CREATE VIEW LISTA_CLIENTES (COND_PAGAMENTO_DESCRICAO, PAIS_NOME, IDCLIENTES, CLIENTE, MORADA, CP, LOCALIDADE, PAIS, TELEFONE, FAX, EMAIL, CONDICOES_PAGAMENTO, REGIME_IVA, PER_DESCONTO, NUNCA_BLOQUEAR, CONTRIBUINTE, COD_CLIENTE_FAT, ATIVO, VALIDADO)
> AS
> select cp.DESCRICAO COND_PAGAMENTO_DESCRICAO,p.NOME PAIS_NOME,c.*
> FROM CLIENTES c
> inner join PAISES p on p.CODIGO_ISO_3166_1=c.PAIS
> inner join COND_PAGAMENTO cp on
> cp.IDCOND_PAGAMENTO=c.CONDICOES_PAGAMENTO
> order by c.CLIENTE;
>
>
> 3) THE SLOW QUERY: From Flamerobin i run the following query and is take 60 seconds to finished:
>
> select first 20 * from LISTA_CLIENTES where (CLIENTE like '%crist%' or MORADA like '%crist%' or LOCALIDADE like '%crist%' or CLIENTE like '%crist%' or PAIS like '%crist%' or PAIS_NOME like '%crist%' or TELEFONE like '%crist%' or EMAIL like '%crist%' or CONTRIBUINTE like '%crist%' or COND_PAGAMENTO_DESCRICAO like '%crist%') order by CLIENTE
>
> If i change the query to:
>
> select first 20 * from LISTA_CLIENTES where (CLIENTE like '%crist%') order by CLIENTE
>
> it runs much faster. But as more fields i had in the WHERE clause, more slow it gets. I try using the CONTAINING function instead of LIKE but i got no improvement. I also tried to create a PROCEDURE to speed up but again not speed up on getting the result.
>
> I use this kind of code with other tables and
> it runs very fast. I also tried using MySQL for comparing and this exact query runs in milliseconds. I tried to read docs in the Firebird site and still no improvements.
> Can anyone help me with this problem?
>
> Thank you.
>
>
the size of the tables would impact on the time to get the records. I
think that if you just filter using CLIENTE like '%crist%' is faster
because it starts by this table, but when you put the fields from the
other tables the optimizer thinks that will be faster to start from
table PAIS for example, and thus the final result is slow. There is no
way to make this kind of query to be fast, it simply cannot use any
index to speed it up and worse sicne you are searching for the same
information on fields from a bunch of tables the optimizer will choose
very bad plans because it will start from the table with fewer rows even
when the data is in fact on the larger table. CONTAINING will no help as
you already experienced and the stored procedure will not help either...
You could "fool" the optimizer forcing the table CLIENTES to be the
first one scanned using a query like:
CREATE VIEW LISTA_CLIENTES (COND_PAGAMENTO_DESCRICAO, PAIS_NOME, IDCLIENTES, CLIENTE, MORADA, CP, LOCALIDADE, PAIS, TELEFONE, FAX, EMAIL, CONDICOES_PAGAMENTO, REGIME_IVA, PER_DESCONTO, NUNCA_BLOQUEAR, CONTRIBUINTE, COD_CLIENTE_FAT, ATIVO, VALIDADO)
AS
select cp.DESCRICAO COND_PAGAMENTO_DESCRICAO,p.NOME PAIS_NOME,c.*
FROM CLIENTES c
left join PAISES p on p.CODIGO_ISO_3166_1=c.PAIS
inner join COND_PAGAMENTO cp on
cp.IDCOND_PAGAMENTO=c.CONDICOES_PAGAMENTO
order by c.CLIENTE;
*But* it will be slower if the real searched data resides on table PAIS...
I use a diferent approach to help the user to search for records, first
the user point wich field it wants to search (for example the Name) then
I do a search in 3 steps:
1.) search for the exact name:
Select * from Lista_Clientes where Cliente = 'something'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record I will go to step 2
2.) search for the beginning
Select * from Lista_Clientes where Cliente like 'something%'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record I will go to step 3
3.) sarch for the string in anywhere in the specified field
Select * from Lista_Clientes where Cliente like '%something%'
if I find just one record, I will return this value, if there is more
than one I would present a grid, so the user could choose from the
available records, if there is no record the user can try another search
The step 1 and 2 are really fast no matter what the size of the table
because it will use an index on the field, step 3 will be slow since no
index could be used, but in the vast majority of the cases step 1 and 2
will return the desired records, step 3 will not be executed very often.
As an aditional feature user could put wildcards on the searched string,
if the search string contains any wildcard the step 1 is bypassed and if
the first character of the string is an wildcard the routine would
bypass step 1 and 2 and go directly for step 3.
I know that what you are trying to achieve is friendly to the user, he
just throw some data and the system tries to find it anywhere, but it
will not be fast. IMHO you should think about the general case, how
often the user want to find a costumer by a string but don't know if
that string is a name or a country or a phone number ??? how much bogus
information will it return ? (ex. costumers who has that piece of the
string on the street address and the user are searching for the name of
the company), I think it will return a lot of uninteresting information.
see you !