Subject Re: [IB-Architect] Next ODS change (was: System table change)
Author Ann Harrison
>Ivan Prenosil wrote:
> > But increasing max index key size from 255 to 64K
> > seems like proper/long term solution to me.
> > Does it have any serious disadvantages ?

At 06:29 PM 11/14/2000 +0300, Dmitry Kuzmenko wrote:

>I think the main disadvangage will be index depth. I've made some calculations
>for some company's database.

>If page_size is 8K, and there is and index with average key length = 42,
>there will be
>about 6 million leaf pages on the index third level. ~6 million only for
>condition that
>pages does not have free spaces. It is not true in reality, and if key
>pages are filled
>only 50%, there will be only 2 million leaf pages on the third level.

Why would reducing the fill level by half reduce the number of
leaf buckets by two thirds? I would expect it to increase the
number by about a factor of two.

An average key length of 42 in a large table is unlikely unless the
key is very odd indeed. Prefix compression works very well.

There's a reasonably simple fix for half-filled index pages (other
than de/re/activating the index) which may be in V6 and certainly
could be introduced easily - not an ODS change.

>Let's do simple calculation for average key lenght = 256 characters (bytes).
>8K / 256 = 32 keys on page. I.e. index root page can point to 32 second
>level index pointer pages....
>After million keys (256 byte sized key, 100% filled key pages) index
>will get depth = 5. Performance will be completely down.

That's a very simplistic computation, ignoring, as it does, prefix and
suffix compression and the miscellaneous other stuff associated with a
key like the page pointer and length word. More to the point, having
the ability to create enormous keys is not a mandate to do so. If someone
wants to index a long character string - say index this whole message -
fine. We should be clear that longer keys lead to deeper indexes and
less performance.

The fact is that multi-byte collating sequences limit our current
keys to less than 100 characters. That's too small. We could remove
the definition time check and just hope that we can always compress
keys to fit ... didn't work last time ... or increase the size of a
key and make sure that people understand that there are costs to
long keys.

My bet? Most of the long keys will turn out to be fine once
compression has removed trailing blanks.