Subject | Re: How to optimize this query using Firebird? |
---|---|

Author | partsi |

Post date | 2007-11-19T14:08:21Z |

Svein,

Thanks for your response.

I am using Firebird 2.1.0.17176. The actual build is not critical

because the same performance problems appear with Firebird 2.0.3 and

2.1 Beta 2.

Query statistics:

DV_PV: Indexed: 2 008 092

PV_TIMESTAMP: Indexed: 1 004 046

PV_LOOKUP: Indexed: 1

Rows in PV_TIMESTAMP: 3 011 008

PROPERTYDEF = 21 rows in PV_TIMESTAMP: 1 004 435

Rows in PV_LOOKUP: 197

Rows in DV_PV: 26 227 652

Distinct PROPERTYVALUE in DV_PV: 3 021 017

Distinct PROPERTYVALUE, LATEST in DV_PV: 3 022 565

Distinct PROPERTYVALUE, PROPERTYDEF, LATEST in DV_PV: 3 022 565

Distinct PROPERTYDEF in DV_PV: 47

Distinct DV_OBJECTTYPE, DV_DOCUMENTPART, DV_VERSIONPART in DV_PV: 2

006 872

RDB$RELATION_NAME RDB$INDEX_NAME RDB$STATISTICS

--------------------------------------------------------------------

PV_TIMESTAMP PK_PV_TIMESTAMP 3,321146948565E-7

PV_TIMESTAMP IX_PV_TIMESTAMP_PD_DATA 3,321146948565E-7

PV_LOOKUP PK_PV_LOOKUP 0,00507614202797413

PV_LOOKUP IX_PV_LOOKUP_PD_DATA_TARGETVER 0,00507614202797413

DV_PV IX_DV_PV_PV 3,30844841300859E-7

DV_PV PK_DV_PV 3,81276983318912E-8

C2.PROPERTYVALUE or C4.PROPERTYVALUE? The optimizer has no way of

knowing this, it's only you who may know...

This is exactly the case. If C2.PROPERTYVALUE is more selective than

C4.PROPERTYVALUE, I want the query to start from C3, not from C5.

CREATE ASC INDEX IX_DV_PV_PV ON DV_PV ( PROPERTYVALUE, PROPERTYDEF,

LATEST )

Actually adding an index for PROPERTYDEF or modifying IX_DV_PV_PV to

include PROPERTYDEF does not help. Each value (DATA) for a given

property (PROPERTYDEF) is stored only once in the data tables (e.g.

PV_TIMESTAMP, PV_LOOKUP), and if multiple object versions refer to

this value, they use the same PROPERTYVALUE key in DV_PV. Therefore,

adding PROPERTYDEF to IX_DV_PV_PV does not make that index more

selective. This fact is also seen from the distinct row counts I

included.

prevent the optimizer from making stupid choices by adding +0 to some

of your conditions. In your query, that would mean something like

C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0 if you were certain (or wanted

to test) that starting with C3 would be a better choice.

The trick "C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0" really helped. I

had already tested tricks "C5.DATA + 0 = 101" and "C5.PROPERTYDEF + 0

= 23 AND C5.DATA + 0 = 101" to disable using the index

IX_PV_LOOKUP_PD_DATA_TARGETVER but in the first case Firebird still

used that index, and the second trick caused a natural scan on C5:

PLAN JOIN (C5 NATURAL, C4 INDEX (IX_DV_PV_PV), C2 INDEX (PK_DV_PV),

C3 INDEX (PK_PV_TIMESTAMP))

This is again a bad plan for my specific case. However, the trick you

suggested advices Firebird to start from C2 using the following plan

that results in a good performance:

PLAN JOIN (C3 INDEX (IX_PV_TIMESTAMP_PD_DATA), C2 INDEX

(IX_DV_PV_PV), C4 INDEX (PK_DV_PV), C5 INDEX (PK_PV_LOOKUP))

Is your trick a reliable way to advice the optimizer?

Thanks,

Timo

Thanks for your response.

I am using Firebird 2.1.0.17176. The actual build is not critical

because the same performance problems appear with Firebird 2.0.3 and

2.1 Beta 2.

Query statistics:

DV_PV: Indexed: 2 008 092

PV_TIMESTAMP: Indexed: 1 004 046

PV_LOOKUP: Indexed: 1

Rows in PV_TIMESTAMP: 3 011 008

PROPERTYDEF = 21 rows in PV_TIMESTAMP: 1 004 435

Rows in PV_LOOKUP: 197

Rows in DV_PV: 26 227 652

Distinct PROPERTYVALUE in DV_PV: 3 021 017

Distinct PROPERTYVALUE, LATEST in DV_PV: 3 022 565

Distinct PROPERTYVALUE, PROPERTYDEF, LATEST in DV_PV: 3 022 565

Distinct PROPERTYDEF in DV_PV: 47

Distinct DV_OBJECTTYPE, DV_DOCUMENTPART, DV_VERSIONPART in DV_PV: 2

006 872

RDB$RELATION_NAME RDB$INDEX_NAME RDB$STATISTICS

--------------------------------------------------------------------

PV_TIMESTAMP PK_PV_TIMESTAMP 3,321146948565E-7

PV_TIMESTAMP IX_PV_TIMESTAMP_PD_DATA 3,321146948565E-7

PV_LOOKUP PK_PV_LOOKUP 0,00507614202797413

PV_LOOKUP IX_PV_LOOKUP_PD_DATA_TARGETVER 0,00507614202797413

DV_PV IX_DV_PV_PV 3,30844841300859E-7

DV_PV PK_DV_PV 3,81276983318912E-8

>selectivity of C3) which of the following is the most selective:

>the ideal, I would guess boils down to (in addition to the

C2.PROPERTYVALUE or C4.PROPERTYVALUE? The optimizer has no way of

knowing this, it's only you who may know...

This is exactly the case. If C2.PROPERTYVALUE is more selective than

C4.PROPERTYVALUE, I want the query to start from C3, not from C5.

>query. Either add a separate index for PROPERDYDEF or change to

> Though there is one thing I would recommend for this particular

CREATE ASC INDEX IX_DV_PV_PV ON DV_PV ( PROPERTYVALUE, PROPERTYDEF,

LATEST )

Actually adding an index for PROPERTYDEF or modifying IX_DV_PV_PV to

include PROPERTYDEF does not help. Each value (DATA) for a given

property (PROPERTYDEF) is stored only once in the data tables (e.g.

PV_TIMESTAMP, PV_LOOKUP), and if multiple object versions refer to

this value, they use the same PROPERTYVALUE key in DV_PV. Therefore,

adding PROPERTYDEF to IX_DV_PV_PV does not make that index more

selective. This fact is also seen from the distinct row counts I

included.

>understand which table condition would be ideal to start with and

> With experience and intimate knowledge about your data, you could

prevent the optimizer from making stupid choices by adding +0 to some

of your conditions. In your query, that would mean something like

C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0 if you were certain (or wanted

to test) that starting with C3 would be a better choice.

The trick "C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE+0" really helped. I

had already tested tricks "C5.DATA + 0 = 101" and "C5.PROPERTYDEF + 0

= 23 AND C5.DATA + 0 = 101" to disable using the index

IX_PV_LOOKUP_PD_DATA_TARGETVER but in the first case Firebird still

used that index, and the second trick caused a natural scan on C5:

PLAN JOIN (C5 NATURAL, C4 INDEX (IX_DV_PV_PV), C2 INDEX (PK_DV_PV),

C3 INDEX (PK_PV_TIMESTAMP))

This is again a bad plan for my specific case. However, the trick you

suggested advices Firebird to start from C2 using the following plan

that results in a good performance:

PLAN JOIN (C3 INDEX (IX_PV_TIMESTAMP_PD_DATA), C2 INDEX

(IX_DV_PV_PV), C4 INDEX (PK_DV_PV), C5 INDEX (PK_PV_LOOKUP))

Is your trick a reliable way to advice the optimizer?

Thanks,

Timo