Subject: Some info about IB internals given by Charlie Caro at BorCon
Author: Pavel Cisar
Hi all,

There are some interesting improvements in IB 7.x, and plans for future
improvements. I wonder if we can use some of them. Borland harvested many
ideas from us, so why can't we do the same?

The following points are taken from an article written by Craig Stuntz,
available at http://delphi.weblogs.com/stories/storyReader

- For example, commit retaining in read committed transactions now
refreshes the snapshot number. The behaviour of snapshot transactions is
(naturally) still the same. It's not a panacea for badly written
applications with long-running transactions, but it sure can help.
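
A minimal isql-style sketch of what this means in practice (the table
name is made up):

  SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
  SELECT COUNT(*) FROM ORDERS;   /* hypothetical table */
  COMMIT RETAINING;
  /* In IB 7.x the retained read committed transaction refreshes its
     snapshot number here, so a long-lived connection that loops on
     COMMIT RETAINING no longer holds up garbage collection as badly. */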

- In response to a question from the audience regarding server-wide
performance monitoring (right now performance monitoring works for one
database at a time), Charlie said that this is planned, and that the idea
is that you would attach to admin.ib and it would produce an aggregate of
all attached DBs. This allows you to set performance monitoring security
using standard SQL GRANT statements, just as it is done today. If you
look at the schema definitions for the performance monitoring tables you
can see that this idea is already in the metadata; it just isn't
implemented yet.
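
You can already try the security part today; a sketch with a
hypothetical user name:

  /* let a monitoring user read attachment info ... */
  GRANT SELECT ON TMP$ATTACHMENTS TO MONUSER;
  /* ... and change statement state (e.g. to cancel a query) */
  GRANT UPDATE ON TMP$STATEMENTS TO MONUSER;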

- InterBase 7 made the InterBase client thread-safe, but concurrent
access to the client within a single connection is serialized.
Concurrent threads within a single connection are serialized in the
client to avoid synchronization in the server, which would have added
serious overhead to single-threaded applications.

- Thanks to its fine-grained multithreading, InterBase 7 is far better
than InterBase 6 at preventing computationally intensive queries from
blocking simple queries from other connections. However, thread
synchronization requirements can make the engine run a bit slower when
only one connection is working at a time.
InterBase 7.1 reduces synchronization points in the engine and thread
scheduler, meaning less overhead due to fine-grained multithreading
support. 7.1 also fixes numerous SMP support bugs which could result in
thread scheduler hangs.

- The ibconfig parameter MAX_THREADS is increased to 1,000,000 by default
(I can't remember if this is true for all installations or only for an
SMP system). This means that there are never any threads in the 7.1
scheduler wait queue. This change was due to testing results; the higher
setting appears to work better. Bottlenecks in the LRU page queue were
also reduced, and memory pool operations are performed with less
synchronization.
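
For reference, it is an ordinary ibconfig entry; an illustrative
excerpt (check your own ibconfig for the shipped default):

  # maximum number of schedulable threads
  MAX_THREADS     1000000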

- Garbage collection was moved into a background pseudo-thread for
InterBase 6. With InterBase 7, garbage collection runs in a real thread.
For InterBase 7.1 the garbage collector was significantly enhanced to
reduce the amount of work it must do.

- One of the biggest problems with garbage collection in the past was
that it made the engine behave anomalously -- the same query might run
quickly or slowly depending upon what garbage collection was doing. The
InterBase team has eliminated this anomalous GC behavior with more
scalable algorithms. They also added boundaries to the collector's use
of CPU and memory resources. The goal is to reduce GC impact on
foreground queries, so the collector's behavior changes depending upon
whether or not the engine is idle. Right now there are essentially three
states for the garbage collector (an illustrative ibconfig sketch
follows the list):

1) Idle
Either there is nothing to collect or no user is connected to the
database. InterBase does not garbage collect when nobody is connected,
since the engine presumes it should disconnect from the database file in
that case.

2) Running with Interruptions
When the engine is working on user queries and SWEEP_YIELD_TIME in
ibconfig is nonzero, the sweep will run but will yield every now and
again to ensure that it doesn't slow foreground queries.

3) Running uninterrupted
When users are connected but not running queries, or when the engine is
working on user queries and SWEEP_YIELD_TIME is 0, then the sweep will
run continuously in its thread. This means that the sweep can slow down
foreground queries. However, this "loss" may be relatively small in
comparison with the consequences of allowing old record versions to
accumulate faster than the collector can clean them up. In particular, if
you plan on doing high-volume OLTP and are finding that the system slows
down over time even though there are no long-running transactions you may
wish to try this setting. If there are N connections, the garbage
collection thread will only get about 1/N of server time anyway because
of the round-robin scheduler. In this state the server should run at
100% CPU use -- the full power of the server is used to service user
queries and collect old versions.
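
The ibconfig side of states 2 and 3 in a sketch (the nonzero value is
only illustrative -- see the 7.1 release notes for the actual default
and units):

  # nonzero: sweep yields now and again ("running with interruptions")
  SWEEP_YIELD_TIME    1

  # zero: sweep runs continuously in its thread ("running uninterrupted")
  #SWEEP_YIELD_TIME   0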

- In the future, SWEEP_YIELD_TIME will probably be set automatically
based on the amount of garbage created.

- A TPC-B MTBF benchmark revealed that back version chains were growing
unchecked even though all transactions were of sub-second duration. A
number of fixes were implemented to reduce the length of these back
version chains in some cases, and to reduce the server's need to
traverse them in other cases.

- A number of fixes were implemented which reduce the amount of work the
server must do to maintain indices on non-unique columns. [ no more
details ]

- After a user performed a DELETE, garbage collection of those committed
deletes would not happen until another user revisited those rows. This
resulted in wasted space and sometimes poor performance. Technically
savvy InterBase developers would sometimes work around this by performing
a SELECT COUNT(*) against the table after a DELETE and COMMIT, but this
didn't always work (depending upon the transaction isolation of other
users). The solution is that when a user COMMITs a DELETE, the server
spools those pages with deleted records off to the GC thread when it
thinks there is no concurrent snapshot looking at that data. Again, the
garbage collector only runs while users are connected to the database, so
the last DB detach can terminate the GC thread before the GC is complete.
In other words, if you're in the habit of doing a SELECT COUNT(*) after a
large DELETE, this should no longer be necessary; just stay connected
until the GC has finished.
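
For reference, the old-style workaround looked something like this
(table name made up); with 7.1 the final COUNT(*) pass should be
unnecessary:

  DELETE FROM ORDERS WHERE SHIPPED < '01/01/2002';
  COMMIT;
  /* pre-7.1 trick: revisit the rows so the committed deletes get
     garbage collected; unreliable if other users hold snapshots */
  SELECT COUNT(*) FROM ORDERS;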

Some info about InterBase 7.1 Service Pack 1.

- This is an update to address issues found in the new GC algorithms. It
also includes additional work to add improvements that slipped past the
original 7.1 deadline.

- InterBase now uses more server memory when doing in-memory sorts.
Versions of InterBase before 7.1 had a static sort memory buffer of
approx. 128 KB. This is now configurable per server using a new ibconfig
parameter called SORTMEM_BUFFER_SIZE. The default value is about 1 MB.
This parameter specifies the size of a sort buffer in memory; setting it
to a higher value will enable better performance in large sort-merge
queries.
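
An illustrative entry, assuming the value is a byte count as the ~1 MB
default suggests:

  # in-memory sort buffer size; raise it for big sort-merge queries
  SORTMEM_BUFFER_SIZE     8388608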

- InterBase's optimizer used to consider the indices with the most keys
first, even if they weren't the best fit for the conjuncts (Booleans) in
the query. This is fixed in the service pack. [ I wonder where this fix
came from? :-)) ]
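
A made-up illustration of the problem:

  /* Suppose EMPLOYEES has a three-segment index on
     (DEPT_NO, JOB_CODE, LOCATION) and a single-segment index on
     LOCATION alone. */
  SELECT * FROM EMPLOYEES WHERE LOCATION = 'Prague';
  /* Before the service pack the optimizer could pick the wider index
     just because it has more keys, even though the single-segment
     index matches the conjunct exactly. */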

- The team added SQL exception logging. If a request is running when the
exception is raised then either the user's SQL string or the proc/trigger
name will appear in interbase.log.

- The limit for the point at which the top-level transaction savepoint is
discarded has been increased in Service Pack 1 from 10,000 to 100,000.
This change should increase stability when updating more than 10,000 rows
at a time.
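
In other words, a bulk statement like this hypothetical one keeps its
top-level savepoint up to 100,000 changed rows now, instead of 10,000:

  /* hypothetical bulk update touching ~50,000 rows */
  UPDATE ORDERS SET STATUS = 'ARCHIVED'
   WHERE ORDER_DATE < '01/01/2002';
  COMMIT;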

- Workaround for the afd.sys memory leak: DUMMY_PACKET_INTERVAL in
ibconfig changes from 60 seconds to 0, disabling dummy packets.
SO_KEEPALIVE is still set on the socket, but TCP/IP may not transmit
keepalive packets for hours, depending upon its setting in the Windows
registry. You can change this, if you like, but keep in mind that it
affects all applications, not just InterBase. Since dummy packets are
disabled by default in the service pack, the InterBase team introduced a
new feature in the performance monitoring tables to allow a keepalive
packet to be sent manually. Just execute UPDATE TMP$ATTACHMENTS SET
TMP$STATE = 'KEEPALIVE' WHERE [some condition] and COMMIT. This sends one
dummy packet to each remote connection matched by the WHERE clause, but
does not continue sending dummy packets after that.
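
For example, to ping all connections of one (made-up) user -- check the
TMP$ATTACHMENTS schema for the exact column names:

  UPDATE TMP$ATTACHMENTS SET TMP$STATE = 'KEEPALIVE'
   WHERE TMP$USER = 'WEBUSER';
  COMMIT;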

So, what's your opinion?

Best regards
Pavel Cisar