Subject Re: [ib-support] collate & index
Author Ann W. Harrison
At 01:15 PM 11/28/2001 +0000, Fabrice Aeschbacher wrote:

>CREATE TABLE T ( ID INTEGER
>, FIRST_NAME VARCHAR ( 40 ) COLLATE EN_US
>, LAST_NAME VARCHAR ( 40 ) COLLATE EN_US);
>
>CREATE INDEX IDX_NAME ON T ( FIRST_NAME, LAST_NAME );
>
>the CREATE INDEX fails with the following error:
>
> ISC ERROR CODE:335544351
> ISC ERROR MESSAGE:
> unsuccessful metadata update
> key size too big for index IDX_CARDHOLDER_NAME
>
>What is the actual max key size for indexes?

It varies slightly depending on the number of key segments.
For single key indexes, it's 252. For each additional segment,
add subtract 4 bytes, then subtract the total size of the key
divided by 5.

>And since the default charset is ISO8859_1, the char size should
>still be equal to 1. So why did the CREATE INDEX statement fail?

The index key is not the same as the field it represents. All
numeric keys are eight bytes as are timestamps. For efficiency,
index keys are setup to compare bytewise - as soon as the first
different byte is found, the comparison is over. As a result
any character collation other than the default involves additional
characters.

Here's a description from Dave Schnepper who actually understands
all this.

The InterBase collation orders for ISO8859 (such as SV_SV) follow a full
linguistic (eg: dictionary) collation order. In such a collation order
spaces (and other punctuation marks) are of 4th level importance.

First order: A is different than B
2nd order: A is different from A-accent-grave
3rd order: A is different than a
4th order: The type of punctuation mark is important.

For instance:
Redwing
Red wing
Red-wing
Redwood
Red wood
Red worm

If spaces (& other punctuation) are treated as a first order difference the
list
becomes sorted as
Redwing
Redwood
Red wing
Red wood
Red worm
Red-wing

Which may be desirable, but isn't a dictionary sort.


Additional bytes are added to represent second and higher level
orders.





Regards,

Ann
www.ibphoenix.com
We have answers.