Subject | Re: How to optimize this query using Firebird? |
---|---|
Author | partsi |
Post date | 2007-11-19T14:08:21Z |
Svein,
Thanks for your response.
I am using Firebird 2.1.0.17176. The actual build is not critical
because the same performance problems appear with Firebird 2.0.3 and
2.1 Beta 2.
Query statistics:
DV_PV: Indexed: 2 008 092
PV_TIMESTAMP: Indexed: 1 004 046
PV_LOOKUP: Indexed: 1
Rows in PV_TIMESTAMP: 3 011 008
PROPERTYDEF = 21 rows in PV_TIMESTAMP: 1 004 435
Rows in PV_LOOKUP: 197
Rows in DV_PV: 26 227 652
Distinct PROPERTYVALUE in DV_PV: 3 021 017
Distinct PROPERTYVALUE, LATEST in DV_PV: 3 022 565
Distinct PROPERTYVALUE, PROPERTYDEF, LATEST in DV_PV: 3 022 565
Distinct PROPERTYDEF in DV_PV: 47
Distinct DV_OBJECTTYPE, DV_DOCUMENTPART, DV_VERSIONPART in DV_PV: 2
006 872
RDB$RELATION_NAME RDB$INDEX_NAME RDB$STATISTICS
--------------------------------------------------------------------
PV_TIMESTAMP PK_PV_TIMESTAMP 3,321146948565E-7
PV_TIMESTAMP IX_PV_TIMESTAMP_PD_DATA 3,321146948565E-7
PV_LOOKUP PK_PV_LOOKUP 0,00507614202797413
PV_LOOKUP IX_PV_LOOKUP_PD_DATA_TARGETVER 0,00507614202797413
DV_PV IX_DV_PV_PV 3,30844841300859E-7
DV_PV PK_DV_PV 3,81276983318912E-8
C2.PROPERTYVALUE or C4.PROPERTYVALUE? The optimizer has no way of
knowing this, it's only you who may know...
This is exactly the case. If C2.PROPERTYVALUE is more selective than
C4.PROPERTYVALUE, I want the query to start from C3, not from C5.
CREATE ASC INDEX IX_DV_PV_PV ON DV_PV ( PROPERTYVALUE, PROPERTYDEF,
LATEST )
Actually adding an index for PROPERTYDEF or modifying IX_DV_PV_PV to
include PROPERTYDEF does not help. Each value (DATA) for a given
property (PROPERTYDEF) is stored only once in the data tables (e.g.
PV_TIMESTAMP, PV_LOOKUP), and if multiple object versions refer to
this value, they use the same PROPERTYVALUE key in DV_PV. Therefore,
adding PROPERTYDEF to IX_DV_PV_PV does not make that index more
selective. This fact is also seen from the distinct row counts I
included.
prevent the optimizer from making stupid choices by adding +0 to some
of your conditions. In your query, that would mean something like
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0 if you were certain (or wanted
to test) that starting with C3 would be a better choice.
The trick "C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0" really helped. I
had already tested tricks "C5.DATA + 0 = 101" and "C5.PROPERTYDEF + 0
= 23 AND C5.DATA + 0 = 101" to disable using the index
IX_PV_LOOKUP_PD_DATA_TARGETVER but in the first case Firebird still
used that index, and the second trick caused a natural scan on C5:
PLAN JOIN (C5 NATURAL, C4 INDEX (IX_DV_PV_PV), C2 INDEX (PK_DV_PV),
C3 INDEX (PK_PV_TIMESTAMP))
This is again a bad plan for my specific case. However, the trick you
suggested advices Firebird to start from C2 using the following plan
that results in a good performance:
PLAN JOIN (C3 INDEX (IX_PV_TIMESTAMP_PD_DATA), C2 INDEX
(IX_DV_PV_PV), C4 INDEX (PK_DV_PV), C5 INDEX (PK_PV_LOOKUP))
Is your trick a reliable way to advice the optimizer?
Thanks,
Timo
Thanks for your response.
I am using Firebird 2.1.0.17176. The actual build is not critical
because the same performance problems appear with Firebird 2.0.3 and
2.1 Beta 2.
Query statistics:
DV_PV: Indexed: 2 008 092
PV_TIMESTAMP: Indexed: 1 004 046
PV_LOOKUP: Indexed: 1
Rows in PV_TIMESTAMP: 3 011 008
PROPERTYDEF = 21 rows in PV_TIMESTAMP: 1 004 435
Rows in PV_LOOKUP: 197
Rows in DV_PV: 26 227 652
Distinct PROPERTYVALUE in DV_PV: 3 021 017
Distinct PROPERTYVALUE, LATEST in DV_PV: 3 022 565
Distinct PROPERTYVALUE, PROPERTYDEF, LATEST in DV_PV: 3 022 565
Distinct PROPERTYDEF in DV_PV: 47
Distinct DV_OBJECTTYPE, DV_DOCUMENTPART, DV_VERSIONPART in DV_PV: 2
006 872
RDB$RELATION_NAME RDB$INDEX_NAME RDB$STATISTICS
--------------------------------------------------------------------
PV_TIMESTAMP PK_PV_TIMESTAMP 3,321146948565E-7
PV_TIMESTAMP IX_PV_TIMESTAMP_PD_DATA 3,321146948565E-7
PV_LOOKUP PK_PV_LOOKUP 0,00507614202797413
PV_LOOKUP IX_PV_LOOKUP_PD_DATA_TARGETVER 0,00507614202797413
DV_PV IX_DV_PV_PV 3,30844841300859E-7
DV_PV PK_DV_PV 3,81276983318912E-8
>selectivity of C3) which of the following is the most selective:
>the ideal, I would guess boils down to (in addition to the
C2.PROPERTYVALUE or C4.PROPERTYVALUE? The optimizer has no way of
knowing this, it's only you who may know...
This is exactly the case. If C2.PROPERTYVALUE is more selective than
C4.PROPERTYVALUE, I want the query to start from C3, not from C5.
>query. Either add a separate index for PROPERDYDEF or change to
> Though there is one thing I would recommend for this particular
CREATE ASC INDEX IX_DV_PV_PV ON DV_PV ( PROPERTYVALUE, PROPERTYDEF,
LATEST )
Actually adding an index for PROPERTYDEF or modifying IX_DV_PV_PV to
include PROPERTYDEF does not help. Each value (DATA) for a given
property (PROPERTYDEF) is stored only once in the data tables (e.g.
PV_TIMESTAMP, PV_LOOKUP), and if multiple object versions refer to
this value, they use the same PROPERTYVALUE key in DV_PV. Therefore,
adding PROPERTYDEF to IX_DV_PV_PV does not make that index more
selective. This fact is also seen from the distinct row counts I
included.
>understand which table condition would be ideal to start with and
> With experience and intimate knowledge about your data, you could
prevent the optimizer from making stupid choices by adding +0 to some
of your conditions. In your query, that would mean something like
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0 if you were certain (or wanted
to test) that starting with C3 would be a better choice.
The trick "C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0" really helped. I
had already tested tricks "C5.DATA + 0 = 101" and "C5.PROPERTYDEF + 0
= 23 AND C5.DATA + 0 = 101" to disable using the index
IX_PV_LOOKUP_PD_DATA_TARGETVER but in the first case Firebird still
used that index, and the second trick caused a natural scan on C5:
PLAN JOIN (C5 NATURAL, C4 INDEX (IX_DV_PV_PV), C2 INDEX (PK_DV_PV),
C3 INDEX (PK_PV_TIMESTAMP))
This is again a bad plan for my specific case. However, the trick you
suggested advices Firebird to start from C2 using the following plan
that results in a good performance:
PLAN JOIN (C3 INDEX (IX_PV_TIMESTAMP_PD_DATA), C2 INDEX
(IX_DV_PV_PV), C4 INDEX (PK_DV_PV), C5 INDEX (PK_PV_LOOKUP))
Is your trick a reliable way to advice the optimizer?
Thanks,
Timo