Subject | Re: [Firebird-Architect] Re: SMP - increasing single-query performance? |
---|---|
Author | Jim Starkey |
Post date | 2006-11-09T23:25:59Z |
Alexandre Benson Smith wrote:
> I think that on normal operation a database server always serves more
> than one query at a time, and this is sufficient to keep all processors
> running their threads to serve each of those queries, even if each query
> uses a single thread and does not run in parallel.
>
> The kind of scenario where an SMP server is serving just one "user"
> running a complex query that could benefit from splitting the query
> into smaller parts and running each part on a different CPU is very rare,
> IMHO.
>
While the opportunities for parallelism for queries are sparse, there
are all sorts of interesting things that I'm doing in Falcon on the
update side. Updates work this way (a sketch of the commit path follows
the list):
1. The client thread performs updates in memory (records and indexes)
2. Blobs, however, are written to page images in the cache. When a
blob is complete, a separate thread, the page writer, starts
writing it.
3. At commit time, the updates (records and indexes) are written to a
serial log. When the page writer is done writing all blobs, a
commit record is written to the serial log, and the transaction is
considered committed.
4. Post commit, another thread, the gopher (go fer this, go fer that)
copies stuff (records and index updates) from the serial log to
page images.
5. Yet another thread, the system scheduler, fires off periodically to
   checkpoint the page cache to disk.
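To make the write-ahead flavor of step 3 concrete, here is a minimal C++
sketch of a commit path: record and index updates are appended to a serial
log, and a commit record marks the transaction durable once the page writer
has finished the blobs. Every name here (SerialLog, Transaction, and so on)
is an illustrative assumption, not Falcon's actual code.

```cpp
#include <cstdio>
#include <string>
#include <vector>

struct SerialLog {
    std::FILE *file;

    explicit SerialLog(const char *path) : file(std::fopen(path, "ab")) {}
    ~SerialLog() { if (file) std::fclose(file); }

    // Append one entry; a real engine would use a binary record format.
    void append(const std::string &entry) {
        if (!file) return;
        std::fwrite(entry.data(), 1, entry.size(), file);
        std::fputc('\n', file);
    }

    // Force the log to the oxide: the transaction is durable only after
    // the commit record has been flushed.
    void flush() { if (file) std::fflush(file); }
};

struct Transaction {
    std::vector<std::string> updates;   // in-memory record/index changes

    // Step 3 above: write the updates, then the commit record once the
    // page writer has finished this transaction's blobs.
    void commit(SerialLog &log, bool blobsWritten) {
        for (const auto &u : updates)
            log.append("UPDATE " + u);
        if (blobsWritten) {
            log.append("COMMIT");
            log.flush();    // the only wait the client thread incurs
        }
    }
};

int main() {
    SerialLog log("serial.log");
    Transaction txn;
    txn.updates = {"record 1", "index 1"};
    txn.commit(log, /*blobsWritten=*/true);
}
```

Note that nothing in this path touches a data page: the client thread waits
only for the sequential log flush, which is the point of the design.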
So data from a single transaction passes through a client thread, the
page writer, the gopher, and the scheduler. Letting updates run
unblocked in memory until commit, waiting only for the basic bits to
hit the oxide, and letting other threads migrate data to disk later
leads to incredible update performance.
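The post-commit hand-off in step 4 is essentially a producer/consumer
pipeline. Below is a hedged sketch of that shape: the commit path enqueues
log entries and a background "gopher" thread migrates them to page images.
The class and method names are assumptions for illustration only.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class Gopher {
    std::queue<std::string> pending;
    std::mutex lock;
    std::condition_variable ready;
    bool shutdown = false;
    std::thread worker;

public:
    Gopher() : worker(&Gopher::run, this) {}

    ~Gopher() {
        { std::lock_guard<std::mutex> g(lock); shutdown = true; }
        ready.notify_one();
        worker.join();      // drains the queue, then exits
    }

    // Called after commit; the client thread never waits for pages.
    void enqueue(std::string logEntry) {
        { std::lock_guard<std::mutex> g(lock); pending.push(std::move(logEntry)); }
        ready.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> g(lock);
        for (;;) {
            ready.wait(g, [this] { return shutdown || !pending.empty(); });
            if (pending.empty())
                return;                     // shutdown and fully drained
            std::string entry = std::move(pending.front());
            pending.pop();
            g.unlock();
            applyToPageImage(entry);        // copy log data into the cache
            g.lock();
        }
    }

    static void applyToPageImage(const std::string &) {
        // placeholder: a real engine updates data and index pages here
    }
};

int main() {
    Gopher gopher;
    gopher.enqueue("COMMIT txn 1");   // client thread returns immediately
}
```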
Queries, however, are hard to parallelize. Better to have a large
record cache so they just go fast.
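To illustrate why a large record cache helps, here is a toy sketch of a
cache-hit fetch: a hit resolves the record entirely in memory, never
touching the page cache or disk. This is an illustrative structure, not
Falcon's actual record cache.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>
#include <unordered_map>

class RecordCache {
    std::unordered_map<uint64_t, std::string> records;  // record number -> data

public:
    void insert(uint64_t recordNumber, std::string data) {
        records[recordNumber] = std::move(data);
    }

    // A hit avoids pages and disk entirely; a miss would fall back to
    // fetching the record from its data page (not shown).
    std::optional<std::string> fetch(uint64_t recordNumber) const {
        auto it = records.find(recordNumber);
        if (it == records.end())
            return std::nullopt;
        return it->second;
    }
};

int main() {
    RecordCache cache;
    cache.insert(42, "a record image");
    if (auto rec = cache.fetch(42))
        std::printf("hit: %s\n", rec->c_str());
}
```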