firebird-support - Re: [firebird-support] Re: Firebird and sharding ?

Subject	Re: [firebird-support] Re: Firebird and sharding ? - Email found in subject
Author	Ann Harrison
Post date	2012-03-29T16:21:59Z

Norm,

> > yes i understand, but now with my 50 millions rows table i start to meet
> the limit of firebird where a simple prepare can take around 1 s to 1 min
> dependantly the charge of the server (see my previous post). next year it's
> will be around 100 millions rows and i will have no solutions ... this why
> i start to thing about sharding in an easy way, in a way out in fact
>
> Now I'm not 100% sure what preparing a statement on Firebird
> should take so long on bigger tables and I can see how, with the present
> state of things, that that will be a problem for you.
>
>

When Norm says he isn't quite sure about something, I have to assume that
lots of people are also in the dark. The performance problem in preparing
queries comes from the algorithm Firebird uses to estimate the cardinality
(number of records) of a table. In deciding how to execute a query,
Firebird considers the cardinality of each table involved and the
selectivity of each index that could be used. Firebird keeps the
selectivity in the index system table, updating it when the index is
recreated or when somebody says "set selectivity." However, the
cardinality is computed for each query.

(That's not as odd as it seems. The distribution of key values is unlikely
to change (much) after the initial batch of data is stored, but the number
of record in a table changes often. Jim and I had a bit of experience with
a database that actually stored the number of records in a table in its
system tables - horrible hot spot that consumed a significant fraction of
the cpu time.)

To understand how Firebird calculates the approximate cardinality of a
table, you need to understand a little bit about how records are found.
The system table RDB$PAGES contains records that give the page number for
pointer pages for a table. The pointer page contains an array of data page
numbers. To find a record, Firebird first decomposes the record number
into three values:

1) the ordinal position of the pointer page in RDB$PAGES (think "select
page_number from rdb$pages where table = <my table> and type = 'Pointer
Page' and position = 1", then the same with position = 2, etc.)

2) the offset in the array of data pages on that pointer page.

3) is the index into an array of offset/length pairs on the data page that
locate the actual record.

OK? Read RDB$PAGES to find pointer page, index into pointer page to find
data page, index into data page to find offset and length of record.

It was clear, even those early days in the dark ages of computing that
counting the actual records to get the cardinality would be a disaster.
But, the number of data pages gives a pretty good approximation. Divide
the page size by an approximation of the record size to get the number of
records per page, then multiply that by the number of data pages and
there's your estimated cardinality. Back then, disks were expensive and
tables were small, so a big table might have three or four pointer pages,
each pointing to about 120 data pages. At that size, knowing whether a
pointer page is full (~124 pages on a 1K page ... remember this was a long
time ago) or just started and containing only one data page. So actually
reading the pointer pages was important. That makes some sense when you've
got maybe as many as a dozen pointer page. With 50 million records, you've
got more than a thousand pointer pages. Reading all of them takes time, and
probably isn't all that much more accurate than just estimating the number
of data pages based on the number of pointer pages, just as it now
estimates the number of records based on the number of data pages.

I have no idea how V3 handles the estimate of cardinality, but one way to
reduce the cost for large tables is to read the pointer pages only if there
are relatively few of them, and for large tables guess based on the number
of entries in RDB$PAGES.

Good luck,

Ann

[Non-text portions of this message have been removed]