Subject Re: select count(*) took time
Author Adam
--- In, Hoang-Vu PHUNG
<hvu_phung@...> wrote:
> Hi Adam,
> Well, my table contains about more than 100 columns
> with the record size about 5 Kbytes. So it is rather
> big but not so big.
> I just use jaybird inside jython. I'm just evaluating
> firebird to see if it is a viable solution vs Oracle.
> My oracle database is about 15 Gigabytes. Is there any
> way to spread a firebird database over several files?
> Regarding my insertions,
> basically it is a prepared statement and then I
> execute over a loop and commit every 100 records.
> The code roughly is :
> sqlInsert = "insert into bigtable(a1,..,an) values
> (?,..?)"
> cursor = db.cursor()
> >> loop and commit every 100 records
> cursor.execute(sqlInsert, params).
> I'm still confused why select count(*) in firebird is
> too slow compared to other databases. Of course, your
> trick can help but imagine if you should do for every
> big tables... Again, request the size of a file is
> quite fast. Finally, scan a file of 200'000 rows to
> get the number of rows is also fast done, even in perl
> / java ...

Firebird databases are not files. Firebird tables are not files. A
Firebird database consists of what is called pages. Each page is about
4K (unless you change it to something else). Pages hold different
information, some hold data, some hold indices, some hold generators,
some hold blobs, etc etc. Firebird stores these pages together in 1 or
more files (from a file system perspective). Multi-file databases
isn't as exciting as you are imagining. It was really to work around
file system limitations regarding maximum file size (eg 2GB on Fat32).
Firebird plays with pages though, reading and writing the individual
pages to disk as required.

There are a number of things about your request that concern me. In an
ACID database such as Firebird or Oracle, the A stands for Atomic.
That means that a transaction either completely succeeds or fails.
There is no middle ground. Some databases use a logging mechanism to
track uncommitted work, and use these logs in the event of a system
failure to make sure the atomicity is maintained. Firebird writes the
differing record versions in the data page itself. When it writes a
record, it attaches your transaction ID to it. The record is written
in a careful order to make sure that it is impossible to end up with
corrupt data (see D[urability]). When a record is committed, its state
flags are changed in the Transaction Inventory Page (TIP). Other
transactions can now know whether or not to pretend that value does
not exist. Committing every 100 records therefore makes no sense,
unless each 100 records should be treated as a single atomic unit of
work. If some failure happens during your insert, you will be left
with a half full table.

You notice that the record now appears in the data page, whether or
not it is actually committed yet. It is also there when you delete it
until the garbage collection does its job.

When you run a select count(*), do you want to see uncommitted data?
What about deleted data? No. Indices are compressed and contain the
bare minimum of information, so much so that they are unidirectional.
This is normally a great advantage, because it means that the index
nodes are more dense. Higher Density = Less required index page
lookups = better performance. So you can't simply run through the
index node to work out the count. You need to actually visit the data
page. For each record you find, you have a new question. "Do I need to
pretend it doesn't exist?" For this, you need to look at the TIP to
see if it is committed. You also have to clean up the garbage if you
are able to, which also exagerates the count speed.

Counts on an unrestricted large table is almost always bad
programming. Never use count when you are testing for existence.