Subject | [firebird-support] Re: How to optimize this query using Firebird? |
---|---|
Author | Svein Erling Tysvær |
Post date | 2007-11-20T12:17:57Z |
Hi Timo!
Noticing that you haven't specified an ORDER BY clause, I assume you don't really care which 100 rows are selected. Moreover, I take it that you're not interested in duplicates. If so, I'd try something like:
SELECT FIRST 100
C2.DV_OBJECTTYPE, C2.DV_DOCUMENTPART, C2.DV_VERSIONPART
FROM DV_PV C2
WHERE
C2.PROPERTYDEF = 21
AND EXISTS(SELECT * FROM PV_TIMESTAMP C3
INNER JOIN DV_PV C4 ON
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE
AND C4.DV_DOCUMENTPART = C2.DV_DOCUMENTPART
AND C4.DV_VERSIONPART = C2.DV_VERSIONPART
INNER JOIN PV_LOOKUP C5 ON C5.PROPERTYVALUE = C4.PROPERTYVALUE
WHERE
C3.PROPERTYVALUE = C2.PROPERTYVALUE
AND C3.PROPERTYDEF = 21
AND C3.DATA >= '24.12.2007'
AND C4.PROPERTYDEF = 23
AND C5.PROPERTYDEF = 23
AND C5.DATA = 101)
or possibly the same thing with C2 and C3 in the main select and C4 and C5 in the EXISTS (By the way, the above SELECT is the one case where an index on PROPERTYDEF could be useful).
I'm sorry to say that I've no experience with dynamic systems with that much data, so I've no good solution for you.
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of partsi
Sent: 20. november 2007 10:33
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: How to optimize this query using Firebird?
Svein,
Thanks for your support!
I would have one question left.
SELECT FIRST 100
C2.DV_OBJECTTYPE, C2.DV_DOCUMENTPART, C2.DV_VERSIONPART
FROM DV_PV C2
INNER JOIN PV_TIMESTAMP C3 ON C3.PROPERTYVALUE =
C2.PROPERTYVALUE
INNER JOIN DV_PV C4 ON
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE
AND C4.DV_DOCUMENTPART = C2.DV_DOCUMENTPART
AND C4.DV_VERSIONPART = C2.DV_VERSIONPART
INNER JOIN PV_LOOKUP C5 ON C5.PROPERTYVALUE = C4.PROPERTYVALUE
WHERE
C2.PROPERTYDEF = 21
AND C3.PROPERTYDEF = 21
AND C3.DATA >= '24.12.2007'
AND C4.PROPERTYDEF = 23
AND C5.PROPERTYDEF = 23
AND C5.DATA = 101
Assume that both "C5.PROPERTYDEF = 23 AND C5.DATA = 101"
and "C3.PROPERTYDEF = 21 AND C3.DATA >= '24.12.2007'" yield a bad
selectivity in DV_PV. For example, the effective selectivity of both
C2.PROPERTYVALUE and C4.PROPERTYVALUE is 500 000 rows. It is possible
that the query does not produce any results and it is fairly slow.
There seems to be no way to get rid of this kind of slowness with
Firebird and the current database structure. Indexed views could help
but Firebird does not support them yet. Is the only way for
optimizing this to use a table like the following:
CREATE TABLE DV_PV (
DV_OBJECTTYPE INTEGER NOT NULL, // PK
DV_DOCUMENTPART INTEGER NOT NULL, // PK
DV_VERSIONPART INTEGER NOT NULL, // PK
LATEST SMALLINT NOT NULL, // PK
PROPERTY21_DATA TIMESTAMP,
PROPERTY23_DATA INTEGER,
.
.
.
PROPERTYN_DATA <datatype>
);
...and create a composite index for the data columns:
CREATE ASC INDEX IX_DV_PV_DATA21_DATA23
ON DV_PV ( PROPERTY21_DATA, PROPERTY23_DATA )
If there were only a few property definitions in our system, we could
use this kind of structure to store the properties of object
versions. Then we could create indexes for any pairs of data columns
to achieve optimum query performance in every case. But because our
system is dynamic with no limit to properties count, this structure
probably does not fit to our purpose.
What do you think? I would also be glad to here if someone else had a
suggestion or solution to this issue.
Timo
Noticing that you haven't specified an ORDER BY clause, I assume you don't really care which 100 rows are selected. Moreover, I take it that you're not interested in duplicates. If so, I'd try something like:
SELECT FIRST 100
C2.DV_OBJECTTYPE, C2.DV_DOCUMENTPART, C2.DV_VERSIONPART
FROM DV_PV C2
WHERE
C2.PROPERTYDEF = 21
AND EXISTS(SELECT * FROM PV_TIMESTAMP C3
INNER JOIN DV_PV C4 ON
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE
AND C4.DV_DOCUMENTPART = C2.DV_DOCUMENTPART
AND C4.DV_VERSIONPART = C2.DV_VERSIONPART
INNER JOIN PV_LOOKUP C5 ON C5.PROPERTYVALUE = C4.PROPERTYVALUE
WHERE
C3.PROPERTYVALUE = C2.PROPERTYVALUE
AND C3.PROPERTYDEF = 21
AND C3.DATA >= '24.12.2007'
AND C4.PROPERTYDEF = 23
AND C5.PROPERTYDEF = 23
AND C5.DATA = 101)
or possibly the same thing with C2 and C3 in the main select and C4 and C5 in the EXISTS (By the way, the above SELECT is the one case where an index on PROPERTYDEF could be useful).
I'm sorry to say that I've no experience with dynamic systems with that much data, so I've no good solution for you.
Set
-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of partsi
Sent: 20. november 2007 10:33
To: firebird-support@yahoogroups.com
Subject: [firebird-support] Re: How to optimize this query using Firebird?
Svein,
Thanks for your support!
I would have one question left.
SELECT FIRST 100
C2.DV_OBJECTTYPE, C2.DV_DOCUMENTPART, C2.DV_VERSIONPART
FROM DV_PV C2
INNER JOIN PV_TIMESTAMP C3 ON C3.PROPERTYVALUE =
C2.PROPERTYVALUE
INNER JOIN DV_PV C4 ON
C4.DV_OBJECTTYPE = C2.DV_OBJECTTYPE
AND C4.DV_DOCUMENTPART = C2.DV_DOCUMENTPART
AND C4.DV_VERSIONPART = C2.DV_VERSIONPART
INNER JOIN PV_LOOKUP C5 ON C5.PROPERTYVALUE = C4.PROPERTYVALUE
WHERE
C2.PROPERTYDEF = 21
AND C3.PROPERTYDEF = 21
AND C3.DATA >= '24.12.2007'
AND C4.PROPERTYDEF = 23
AND C5.PROPERTYDEF = 23
AND C5.DATA = 101
Assume that both "C5.PROPERTYDEF = 23 AND C5.DATA = 101"
and "C3.PROPERTYDEF = 21 AND C3.DATA >= '24.12.2007'" yield a bad
selectivity in DV_PV. For example, the effective selectivity of both
C2.PROPERTYVALUE and C4.PROPERTYVALUE is 500 000 rows. It is possible
that the query does not produce any results and it is fairly slow.
There seems to be no way to get rid of this kind of slowness with
Firebird and the current database structure. Indexed views could help
but Firebird does not support them yet. Is the only way for
optimizing this to use a table like the following:
CREATE TABLE DV_PV (
DV_OBJECTTYPE INTEGER NOT NULL, // PK
DV_DOCUMENTPART INTEGER NOT NULL, // PK
DV_VERSIONPART INTEGER NOT NULL, // PK
LATEST SMALLINT NOT NULL, // PK
PROPERTY21_DATA TIMESTAMP,
PROPERTY23_DATA INTEGER,
.
.
.
PROPERTYN_DATA <datatype>
);
...and create a composite index for the data columns:
CREATE ASC INDEX IX_DV_PV_DATA21_DATA23
ON DV_PV ( PROPERTY21_DATA, PROPERTY23_DATA )
If there were only a few property definitions in our system, we could
use this kind of structure to store the properties of object
versions. Then we could create indexes for any pairs of data columns
to achieve optimum query performance in every case. But because our
system is dynamic with no limit to properties count, this structure
probably does not fit to our purpose.
What do you think? I would also be glad to here if someone else had a
suggestion or solution to this issue.
Timo