Subject | Re: [firebird-support] Performance Question |
---|---|
Author | Helen Borrie |
Post date | 2004-08-07T23:59:25Z |
At 01:47 PM 7/08/2004 -0700, you wrote:
the plan for both queries would be identical: it would choose to walk a
small set using natural order if it is cheaper than using an index.
An EXISTS() query will beat a COUNT(*) query in a bigger set, because it
exits as soon as it finds an eligible row. OTOH, if there is no eligible
row, the number of rows walked will be exactly the same in both cases, i.e.
EXISTS() won't know the answer until it has walked the whole set. If it's
a big set with a useable index on the search criteria, the EXISTS() query
will win.
A tip for EXISTS() queries: use "EXISTS(SELECT 1 FROM....) rather than
SELECT *, since you're not returning a set.
It's essential to watch the plans when you're playing about with
performance, and to think about your indexes and watch the statistics of
the indexes selected by the optimiser. Also, where the test data catchment
is as small as this, while the production data is potentially huge, it's
also essential to get hold of a test data generator and build yourself some
suitably sized sets of test data.
/heLen
>For some reason I expected this first query to be faster than theSure. The method chosen by the optimiser is cost-based. It's likely that
>second. However, they appear to be about the same. I have ten of
>thousands of records. The first query actual performs more fetches.
>Also, if interested, there are no table scans in the plan for the query.
>
>Using Exists
>============
>SELECT 1
>FROM RDB$DATABASE
>WHERE EXISTS(
> SELECT *
> FROM Groupable
> INNER JOIN NodeGroup ON Groupable.oid = NodeGroup.oid
> INNER JOIN Link ON NodeGroup.id = Link.childId
> INNER JOIN Path ON Link.parentId = Path.descendentId
> WHERE Path.ancestorId = 1 AND Link.childId = 3)
>
>Using Count
>===========
>SELECT COUNT(*)
>FROM Groupable
> INNER JOIN NodeGroup ON Groupable.oid = NodeGroup.oid
> INNER JOIN Link ON NodeGroup.id = Link.childId
> INNER JOIN Path ON Link.parentId = Path.descendentId
>WHERE Path.ancestorId = 1 AND Link.childId = 3
>
>The Count returns 100. Any thoughts?
the plan for both queries would be identical: it would choose to walk a
small set using natural order if it is cheaper than using an index.
An EXISTS() query will beat a COUNT(*) query in a bigger set, because it
exits as soon as it finds an eligible row. OTOH, if there is no eligible
row, the number of rows walked will be exactly the same in both cases, i.e.
EXISTS() won't know the answer until it has walked the whole set. If it's
a big set with a useable index on the search criteria, the EXISTS() query
will win.
A tip for EXISTS() queries: use "EXISTS(SELECT 1 FROM....) rather than
SELECT *, since you're not returning a set.
It's essential to watch the plans when you're playing about with
performance, and to think about your indexes and watch the statistics of
the indexes selected by the optimiser. Also, where the test data catchment
is as small as this, while the production data is potentially huge, it's
also essential to get hold of a test data generator and build yourself some
suitably sized sets of test data.
/heLen