Subject | Re: Query Times for Firebird. Issue with where claus? |
---|---|
Author | Svein Erling Tysvær |
Post date | 2006-05-30T08:16:19Z |
Hmm, you really ought to tell us the respective plans generated by the
optimizer, without them it feels like looking for the needle in the
haystack - without knowing where the haystack is! (ok, I am
exaggerating a bit)
Looking at your query, I notice two things.
You use SQL-89 and not SQL-92.
In your case it doesn't matter much, it is cleaner with SQL-92 and may
make things easier for whichever components you use (my experience is
limited to IBO, and I know that with SQL-89 you often have to specify
one extra attribute called JoinLinks). Though this is not the cause of
your problem.
Parts of your query cannot use indexes: (BERN1.DOB_DAY -
BERN2.DOB_DAY) < 2 cannot possibly use any index (in your original
query, you use the same field twice, but I expect this to just be a
spelling mistake), but it is simple to change to something that may
use an index. This may be the needle if I'm in a haystack, but quite
frankly - it is just guessing without any PLANs and knowledge about
your indexes (normally, the problem is either using too many or too
few indexes, for once I guessed on using too few).
So I suggest you try to change to
select
count(*)
from BERN1
JOIN BERN2 ON BERN1.GENDER = BERN2.GENDER
WHERE
(
(
BERN1.DOB_YEAR = BERN2.DOB_YEAR
AND
(
BERN1.DOB_MO = BERN2.DOB_MO OR
BERN1.DOB_DAY = BERN2.DOB_DAY OR
BERN1.DOB_MO = BERN2.DOB_DAY
)
)
AND /* THIS IS THE PROBLEM LINE *****************/
(
BERN1.DOB_DAY BETWEEN BERN2.DOB_DAY-1 AND BERN2.DOB_DAY+1 AND
BERN1.DOB_MO = BERN2.DOB_MO AND
BERN1.DOB_YEAR = BERN2.DOB_YEAR
)
);
Actually, I'd normally put the DOB_YEAR outside the "problem area",
but if you use combined indexes rather than single field indexes, this
may be a very bad idea - so I didn't change that bit.
I've never done any work on tables with 30 million rows, but I often
join two or three tables with about 1 million rows each. What matters
is how unique indexes you can use for the joining. I would never join
that many records on fields like - say - COUNTRY, whereas joining on
almost unique fields like PERSON is a piece of cake.
HTH,
Set
optimizer, without them it feels like looking for the needle in the
haystack - without knowing where the haystack is! (ok, I am
exaggerating a bit)
Looking at your query, I notice two things.
You use SQL-89 and not SQL-92.
In your case it doesn't matter much, it is cleaner with SQL-92 and may
make things easier for whichever components you use (my experience is
limited to IBO, and I know that with SQL-89 you often have to specify
one extra attribute called JoinLinks). Though this is not the cause of
your problem.
Parts of your query cannot use indexes: (BERN1.DOB_DAY -
BERN2.DOB_DAY) < 2 cannot possibly use any index (in your original
query, you use the same field twice, but I expect this to just be a
spelling mistake), but it is simple to change to something that may
use an index. This may be the needle if I'm in a haystack, but quite
frankly - it is just guessing without any PLANs and knowledge about
your indexes (normally, the problem is either using too many or too
few indexes, for once I guessed on using too few).
So I suggest you try to change to
select
count(*)
from BERN1
JOIN BERN2 ON BERN1.GENDER = BERN2.GENDER
WHERE
(
(
BERN1.DOB_YEAR = BERN2.DOB_YEAR
AND
(
BERN1.DOB_MO = BERN2.DOB_MO OR
BERN1.DOB_DAY = BERN2.DOB_DAY OR
BERN1.DOB_MO = BERN2.DOB_DAY
)
)
AND /* THIS IS THE PROBLEM LINE *****************/
(
BERN1.DOB_DAY BETWEEN BERN2.DOB_DAY-1 AND BERN2.DOB_DAY+1 AND
BERN1.DOB_MO = BERN2.DOB_MO AND
BERN1.DOB_YEAR = BERN2.DOB_YEAR
)
);
Actually, I'd normally put the DOB_YEAR outside the "problem area",
but if you use combined indexes rather than single field indexes, this
may be a very bad idea - so I didn't change that bit.
I've never done any work on tables with 30 million rows, but I often
join two or three tables with about 1 million rows each. What matters
is how unique indexes you can use for the joining. I would never join
that many records on fields like - say - COUNTRY, whereas joining on
almost unique fields like PERSON is a piece of cake.
HTH,
Set
--- In firebird-support@yahoogroups.com, "lec_sas" wrote:
> Me again,
> I am doing some volume testing and I ran across the following
> issue. The query I have pasted below runs against two tables. Bern1
> having 3730 records, Bern2 having 2987. The query is a bit of a
> dummy query but the idea was to mock some of our old test queries
> for some time comparisons.
>
> This query executes fine in about .2-.5 seconds using the embeded
> version. However, when I alter the identified "AND" condition
> to "OR", the query takes anywhere from 12-17 seconds. I ported the
> syntax and ran it in SQL Server 2005 and my query times were the
> same for both (or very close).
>
> Could this perhaps be an issue with the way FireBird interprets
> the "OR'S" in the where clause?
>
> Is there any possible suggestions to how this can be optimized to
> avoid such results? In our production environments, worst case
> scenario could have a query running against a pair of 30 million
> record tables, so 12-17 seconds with such a small set of data raises
> concerns.
>
> Has there been anyone in the past do some major volume testing whith
> those kinds of numbers before (30 mil). If so, care to share some
> results?
>
> Query:
>
> select
> count(*)
> from
> BERN1,
> BERN2
> where
> (BERN1.GENDER = BERN2.GENDER)
> AND
> (
> (
> BERN1.DOB_YEAR = BERN2.DOB_YEAR
> AND
> (
> BERN1.DOB_MO = BERN2.DOB_MO OR
> BERN1.DOB_DAY = BERN2.DOB_DAY OR
> BERN1.DOB_MO = BERN2.DOB_DAY
> )
> )
> AND /* THIS IS THE PROBLEM LINE *****************/
> (
> (BERN1.DOB_DAY - BERN1.DOB_DAY) < 2 AND
> (BERN2.DOB_DAY - BERN1.DOB_DAY) < 2 AND
> BERN1.DOB_MO = BERN2.DOB_MO AND
> BERN1.DOB_YEAR = BERN2.DOB_YEAR
> )
> );
>
> Thanks
>
> Darin Amos