Subject Re: [firebird-support] Query with aggregation is slow!
Author Helen Borrie
At 09:52 PM 3/12/2009, kadriizer wrote:
>Hi All,
>
>Below, I have a query and performance information ...
>
>SELECT * FROM (
> SELECT
> L.ID,
> O.ORDERDATE,
> T.TXT CATEGORY,
> P.TITLE,
> P.PRICE UNITPRICE,
> L.QUANTITY,
> C.FIRSTNAME, C.LASTNAME,
> C.CITY,
> O.NETAMOUNT, O.TAX, O.TOTALAMOUNT
> FROM ORDERLINE L /* has 60222 records */
> LEFT JOIN ORDERS O ON L.ORDERS_ID = O.ID /* has 20000 records */
> LEFT JOIN PRODUCT P ON L.PRODUCT_ID = P.ID /* has 10000 records */
> LEFT JOIN CATEGORY T ON P.CATEGORY_ID = T.ID /* has 16 records */
> LEFT JOIN CUSTOMER C ON O.CUSTOMER_ID = C.ID /* has 10000 records */
> WHERE L.ID > 0
>) AS TBL
>
>------ Performance info ------
>Prepare time = 0ms
>Execute time = 16ms

This is not an aggregation (referring to Subject). What you are doing here is creating a derived table and selecting all of its output. Can you explain what is the purpose of this?


>This query, "select * from" portion, as "select count (*) from" is changed, the performance information is as follows ...
>
>------ Performance info ------
>Prepare time = 0ms
>Execute time = 1s 904ms
>
>First query is faster than second query! Why?

The two queries are not similar, except that both are inefficient.

In the select * version, you are first creating the derived table and then going through the output and testing each row by the search expression.

Because count(*) on a non-aggregated query walks the entire set. If you want aggregated counts then you need to use GROUP BY.

BTW, do you actually have any records in ORDERLINE that are likely to be zero or negative, or null? If not, it won't make sense to test the output for >0, because ORDERLINE is the leftmost stream of the left joins.

You wrote:
>Attention to the developers!
>MSSQL can do this very well ;-)

This is fine if "very well" means "very fast" and you just want a snapshot view of the record count when you started. MSSQL maintains a count of committed records right in the table, so its accuracy is unstable at best. It's hard to know why they do it....but possibly to enable a quick estimate for loading up a client-side GUI mechanism like a progress bar.

From a design perspective, there's no reason to do a non-aggregated record count. I hope you're not doing it for something like calculating the "next" value for a key...

./heLen