Subject | RE: [firebird-support] Re: Fbserver eating CPU...not taking advantage of memory |
---|---|
Author | Aaron Abend |
Post date | 2005-05-23T00:48:08Z |
Thanks Ann & Adam,
All of your comments are very useful.
Neither Oracle nor Sybase/MSSQL support more than one index per table
reference per query (in general). The biggest difference with Firebird is
that it will use more than one index. I am still trying to figure out how to
use this capability effectively.
Let's take a simple example: a table with 50,000 records which has a VALUE
and an ID (like a state table but much bigger). Sometimes, I will query on
the VALUE and look for a list of matching IDs that will be used to retrieve
VALUEs from another table by a join. In this case, I would index (VALUE, ID)
to drive the query from the VALUE to the ID, and possibly allow the query to
use just the index, since the index would contain all of the data required
for the query. In other queries, I would be using this same table as a
lookup to display appropriate values for a list of IDs, again through a
join. For this purpose, I put another index on the table, (ID, VALUE). The
cost of the two indexes is worth the price of inserts based on our
benchmarks and we get good performance in SQLServer. It seems that in
Firebird, singleton indexes on ID and VALUE might be best, since as point 4
below makes clear, database versioning would require the table to be read
anyway.
Is this analysis correct? I am going to try this approach and benchmark it
against the existing MSSQL optimized indexes. Now it is my understanding
that Firebird does not support clustered indexes. Is this true? I know a lot
about clustered indexes, which Sybase/MSSQL support and Oracle recently
began to support with index ordered tables. In general, I do NOT cluster on
primary keys because that is rarely the most common access path. Also,
clusters can be "too appealing" to the optimizer, which will scan a cluster
(inefficient) when a perfectly good non-clustered index would support a
faster retrieval.
If any of you wants to try the software, which we have just ported to
Firebird, please feel free to visit our website and download the beta at
www.viapoint.com/downloads/viapointbeta.exe. Viapoint helps people organize
files, emails, and web content in one place. (The current production
version, available at www.viapoint.com/downloads/viapointgds.exe, runs on an
embedded version of MSSQL).
Thanks again.
Aaron
-----Original Message-----
From: firebird-support@yahoogroups.com
[mailto:firebird-support@yahoogroups.com] On Behalf Of Ann W. Harrison
Sent: Sunday, May 22, 2005 1:57 PM
To: firebird-support@yahoogroups.com
Subject: Re: [firebird-support] Re: Fbserver eating CPU...not taking
advantage of memory
Aaron Abend wrote:
InterBase) and other databases.
1) Record location. Many databases cluster records on the primary key
index, either directly storing the data in the index or using the key to
group records. In a well-balanced system clustering on primary keys
makes primary key lookup very quick, though it leads to a wide fanout of
the index when data is stored in leaf nodes, and either sparse storage
or overflows when clustering - or with some value distributions both.
Firebird stores records on data pages, using the most accessible page
with sufficient space. Indexes are stored on index pages an contain a
record locater in the leaf node.
2) Index usage. Most databases systems read an index node, retrieve the
data - quick for clustered primary indexes, slower for secondary indexes
which often contain the primary key as the record locater, turning a
secondary index lookup into two index lookups. This technique also
leads to bouncing between index pages and data, which can be resolved by
proper placement control, assuming that the DBA has the time and skill
to do so. For non-clustered indexes this technique also results in
rereading data pages.
Firebird harvests the record locaters for qualifying records from the
index, builds a bitmap of record locaters, then reads the records in
physical storage order.
3) Index optimization. Because the access strategy binds index access
and record access tightly, most database optimizers must choose one
index per table as the path to data. Because it creates a bitmap of
record locaters, Firebird can use several indexes on a table by anding
and oring the bitmaps. At one point, it got a bit carried away and used
secondary indexes when unique index is available - that got fixed in 1.5.
4) Indexes in lieu of data. Non-versioning databases can resolve some
queries (counts for example) by reading the index without actually
reading the record data. Firebird (and Postgres and other natively
versioning databases) indexes contain entries that are not yet visible
to other transactions and entries that are no longer relevant to some
active transactions. The only way to know is to read the data.
5) Long duplicate chains. Some databases (Firebird 2 for one) are
better than others (Firebird 1.x for one) at removing data from long
(>10000) duplicate chains in indexes.
figure out why your CPU is off the scale.
Regards,
Ann
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Visit http://firebird.sourceforge.net and click the Resources item
on the main (top) menu. Try Knowledgebase and FAQ links !
Also search the knowledgebases at http://www.ibphoenix.com
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
_____
Yahoo! Groups Links
* To visit your group on the web, go to:
http://groups.yahoo.com/group/firebird-support/
* To unsubscribe from this group, send an email to:
firebird-support-unsubscribe@yahoogroups.com
<mailto:firebird-support-unsubscribe@yahoogroups.com?subject=Unsubscribe>
* Your use of Yahoo! Groups is subject to the Yahoo! Terms of
<http://docs.yahoo.com/info/terms/> Service.
[Non-text portions of this message have been removed]
All of your comments are very useful.
Neither Oracle nor Sybase/MSSQL support more than one index per table
reference per query (in general). The biggest difference with Firebird is
that it will use more than one index. I am still trying to figure out how to
use this capability effectively.
Let's take a simple example: a table with 50,000 records which has a VALUE
and an ID (like a state table but much bigger). Sometimes, I will query on
the VALUE and look for a list of matching IDs that will be used to retrieve
VALUEs from another table by a join. In this case, I would index (VALUE, ID)
to drive the query from the VALUE to the ID, and possibly allow the query to
use just the index, since the index would contain all of the data required
for the query. In other queries, I would be using this same table as a
lookup to display appropriate values for a list of IDs, again through a
join. For this purpose, I put another index on the table, (ID, VALUE). The
cost of the two indexes is worth the price of inserts based on our
benchmarks and we get good performance in SQLServer. It seems that in
Firebird, singleton indexes on ID and VALUE might be best, since as point 4
below makes clear, database versioning would require the table to be read
anyway.
Is this analysis correct? I am going to try this approach and benchmark it
against the existing MSSQL optimized indexes. Now it is my understanding
that Firebird does not support clustered indexes. Is this true? I know a lot
about clustered indexes, which Sybase/MSSQL support and Oracle recently
began to support with index ordered tables. In general, I do NOT cluster on
primary keys because that is rarely the most common access path. Also,
clusters can be "too appealing" to the optimizer, which will scan a cluster
(inefficient) when a perfectly good non-clustered index would support a
faster retrieval.
If any of you wants to try the software, which we have just ported to
Firebird, please feel free to visit our website and download the beta at
www.viapoint.com/downloads/viapointbeta.exe. Viapoint helps people organize
files, emails, and web content in one place. (The current production
version, available at www.viapoint.com/downloads/viapointgds.exe, runs on an
embedded version of MSSQL).
Thanks again.
Aaron
-----Original Message-----
From: firebird-support@yahoogroups.com
[mailto:firebird-support@yahoogroups.com] On Behalf Of Ann W. Harrison
Sent: Sunday, May 22, 2005 1:57 PM
To: firebird-support@yahoogroups.com
Subject: Re: [firebird-support] Re: Fbserver eating CPU...not taking
advantage of memory
Aaron Abend wrote:
>records
> We do all our testing on large real-life datasets with about 50,000
> after we do development on demo datasets. Having done database applicationbut
> development for 25 years, I have thought of most of the obvious errors,
> I do appreciate your ideas. It is the Firebird-specific stuff that Ireally
> need help on. My thinking right now is that our indexing, which drew fromThere are several differences in index handling between Firebird (and
> experience with other databases, is probably not optimal for Firebird.
InterBase) and other databases.
1) Record location. Many databases cluster records on the primary key
index, either directly storing the data in the index or using the key to
group records. In a well-balanced system clustering on primary keys
makes primary key lookup very quick, though it leads to a wide fanout of
the index when data is stored in leaf nodes, and either sparse storage
or overflows when clustering - or with some value distributions both.
Firebird stores records on data pages, using the most accessible page
with sufficient space. Indexes are stored on index pages an contain a
record locater in the leaf node.
2) Index usage. Most databases systems read an index node, retrieve the
data - quick for clustered primary indexes, slower for secondary indexes
which often contain the primary key as the record locater, turning a
secondary index lookup into two index lookups. This technique also
leads to bouncing between index pages and data, which can be resolved by
proper placement control, assuming that the DBA has the time and skill
to do so. For non-clustered indexes this technique also results in
rereading data pages.
Firebird harvests the record locaters for qualifying records from the
index, builds a bitmap of record locaters, then reads the records in
physical storage order.
3) Index optimization. Because the access strategy binds index access
and record access tightly, most database optimizers must choose one
index per table as the path to data. Because it creates a bitmap of
record locaters, Firebird can use several indexes on a table by anding
and oring the bitmaps. At one point, it got a bit carried away and used
secondary indexes when unique index is available - that got fixed in 1.5.
4) Indexes in lieu of data. Non-versioning databases can resolve some
queries (counts for example) by reading the index without actually
reading the record data. Firebird (and Postgres and other natively
versioning databases) indexes contain entries that are not yet visible
to other transactions and entries that are no longer relevant to some
active transactions. The only way to know is to read the data.
5) Long duplicate chains. Some databases (Firebird 2 for one) are
better than others (Firebird 1.x for one) at removing data from long
(>10000) duplicate chains in indexes.
>Yes, but lowering the priority is not going to fix anything. Lets
>
> Another thought - is there any way to reduce the priority of the Firebird
> process in the config file? Is there any harm to reducing its priority?
figure out why your CPU is off the scale.
Regards,
Ann
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Visit http://firebird.sourceforge.net and click the Resources item
on the main (top) menu. Try Knowledgebase and FAQ links !
Also search the knowledgebases at http://www.ibphoenix.com
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
_____
Yahoo! Groups Links
* To visit your group on the web, go to:
http://groups.yahoo.com/group/firebird-support/
* To unsubscribe from this group, send an email to:
firebird-support-unsubscribe@yahoogroups.com
<mailto:firebird-support-unsubscribe@yahoogroups.com?subject=Unsubscribe>
* Your use of Yahoo! Groups is subject to the Yahoo! Terms of
<http://docs.yahoo.com/info/terms/> Service.
[Non-text portions of this message have been removed]