Subject | Re: [firebird-support] Major query performance problems |
---|---|
Author | Ann W. Harrison |
Post date | 2010-12-08T18:43:43Z |
On 12/7/2010 1:31 PM, Alec Swan wrote:
unless you actually select more rows than the limit.
PHYSICAL_COPY 1,983,531 records.
COPY 2,170,227 records.
OK, as Svein said, you've got a field with two values ... and probably
a preponderance of one over the other. Firebird handles that case
badly. The work around, as he said, is to make the index look unusable
if you're selecting on the value that is vastly more common. For
character fields, concatenate with an empty string. For dates and
numbers add 0.
Unfortunately, it does look as if the problem is with the index on
COPY.SOURCE_ID. How many records in COPY have that value for
SOURCE_ID?
What other indexes do you have on those tables?
SELECT FIRST 1000 PHYSICAL_COPY."ID" AS COL0
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
AND (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
may be paying for loading a large cache as well as preparing and
running the query.
an index presumably on PHYSICAL_COPY.COPY_ID, for which we don't
have statistics.
an index, could be references to records - that are resolved
in cache. 16,347,133 is a lot.
Marks are changes to a page in cache. So this query in
addition to everything else is doing some garbage collection.
Reads are physical disk reads. 262,804 is larger than your
cache, so pages are being re-read.
Writes are physical disk writes - the pages that had
garbage collected are back on disk.
My guess is that this did run on a cold database because your
query should use only indexed lookups. The sequential look ups
are on small system tables. (I hope).
There's a fair amount of work involved in building the in-memory
versions of the system tables and, of course, you probably have
the upper levels of the indexes in cache plus pointer pages for
data.
SELECT FIRST 1000 PHYSICAL_COPY."ID" AS COL0
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
condition
COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad'
was so unselective...
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
very different... Odd indeed.
Cheers,
Ann
>In general, using a "FIRST" will make your queries slower
> Note that the query returns no row.
unless you actually select more rows than the limit.
>I've got some inside information on the database ...
> The following were measured on Firebird 2.5.0.26074 using Flamerobin 0.9.3.1870.
>
>
> Statistics on COPY."SOURCE_ID" column index
> IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.018519
> Statistics on PHYSICAL_COPY."IS_MARKED_DELETED" column index
> IDX_Lhi+/ZlWWZ/FDOab6YV2Vg== is 0.50000
PHYSICAL_COPY 1,983,531 records.
COPY 2,170,227 records.
OK, as Svein said, you've got a field with two values ... and probably
a preponderance of one over the other. Firebird handles that case
badly. The work around, as he said, is to make the index look unusable
if you're selecting on the value that is vastly more common. For
character fields, concatenate with an empty string. For dates and
numbers add 0.
Unfortunately, it does look as if the problem is with the index on
COPY.SOURCE_ID. How many records in COPY have that value for
SOURCE_ID?
What other indexes do you have on those tables?
SELECT FIRST 1000 PHYSICAL_COPY."ID" AS COL0
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
AND (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
> Prepare time: 1.203sDid you run this on a database that was just recently opened? You
may be paying for loading a large cache as well as preparing and
running the query.
> PLAN JOIN (COPY INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==),it's looking up COPY by a reasonably selective index and then using
> PHYSICAL_COPY INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
an index presumably on PHYSICAL_COPY.COPY_ID, for which we don't
have statistics.
> 16347133 fetches, 4 marks, 262804 reads, 4 writes.Fetches are the number of references - could be references to
an index, could be references to records - that are resolved
in cache. 16,347,133 is a lot.
Marks are changes to a page in cache. So this query in
addition to everything else is doing some garbage collection.
Reads are physical disk reads. 262,804 is larger than your
cache, so pages are being re-read.
Writes are physical disk writes - the pages that had
garbage collected are back on disk.
> 0 inserts, 0 updates, 0 deletes, 4042375 index, 4289 seq.That's 4,042375 indexed lookups, and 4289 sequential look ups.
My guess is that this did run on a cold database because your
query should use only indexed lookups. The sequential look ups
are on small system tables. (I hope).
> Delta memory: 1423880 bytes.Right. That's awful.
> Total execution time: 0:30:22 (hh:mm:ss)
>Is this on the database after it's run the previous query.
> NOW THE SAME QUERY WITH ONE FILTER CONDITION:
There's a fair amount of work involved in building the in-memory
versions of the system tables and, of course, you probably have
the upper levels of the indexes in cache plus pointer pages for
data.
SELECT FIRST 1000 PHYSICAL_COPY."ID" AS COL0
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (PHYSICAL_COPY."IS_MARKED_DELETED" = 1)
> PLAN JOIN (PHYSICAL_COPY INDEX (IDX_sKfIZDMkd26ITRo1Da5mzQ==),OK, that's a lot better. I would not have guessed that the
> COPY INDEX (PK_ZM6SRonqR8AHSQuCISgvnQ==))
> 186 fetches, 0 marks, 50 reads, 0 writes.
> 0 inserts, 0 updates, 0 deletes, 24 index, 0 seq.
> Delta memory: -1423168 bytes.
> Total execution time: 3.297s
condition
COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad'
was so unselective...
>SELECT FIRST 1000 PHYSICAL_COPY."ID" AS COL0
> NOW THE SAME QUERY WITH THE OTHER FILTER CONDITION:
FROM PHYSICAL_COPY INNER JOIN COPY
ON PHYSICAL_COPY."COPY_ID" = COPY."ID"
WHERE (COPY."SOURCE_ID" = 'fb50a346-04e7-42d3-8782-1432001c74ad')
>OK, that's interesting. There's something very not linear here.
> Prepare time: 0.094s
>
> PLAN JOIN (COPY INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==),
> PHYSICAL_COPY INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
>
> 2273 fetches, 0 marks, 1293 reads, 0 writes.
> 0 inserts, 0 updates, 0 deletes, 225 index, 0 seq.
> Delta memory: 1422708 bytes.
> Total execution time: 23.624s
> PLAN JOIN (COPY INDEX (IDX_Lhi+/ZlWWZ/FDOab6YV2Vg==),That's the same plan as the first query, and yet the results are
> PHYSICAL_COPY INDEX (IDX_AlJS5EmMT9tODQnFqmid0w==))
very different... Odd indeed.
Cheers,
Ann