Subject Re: [firebird-support] Composite Primary Key in JOIN - performance
Author Ann Harrison

On Aug 11, 2015, at 4:59 AM, steve.declerck@... [firebird-support] <firebird-support@yahoogroups.com> wrote:

...when I take a look at the small lookup tables used to store payment conditions, currency codes, titles, etc...., the primary key is always VARCHAR(4) to either store a generator number of a user defined CODE. But most of the time the value is just coming from a generator and with '0' as prefixes, like '0001', '0002'. It doesn't make much sense to me as the code is not shown anywhere. I guess the design would be better if a SMALLINT was used, am I right ? 2 Bytes instead of 8, correct ?

In general, fixed size small textual fields should be CHAR rather than VARCHAR.  Since the value stored is always the same length, the two bytes that describe the actual vs. declared length are wasted.  And yes, an integer type would be better if the content will always be numeric.  I have a personal problem with small int based on unpleasant experiences when they overflow, wrap around, and are generally a nuisance.  But as long as you're certain you'll never have more than 32,767 of them... 

The PROJECT table has a primary key VARCHAR(7) and more worse the SUBPROJECT table has composite primary key made up of the PROJECTNUMBER VARCHAR(7) AND A UNIQUE SUBPROJECTNUMBER for each project, VARCHAR(4).

So JOINs look like this :

JOIN PROJECT PR ON PR.PR_NR = <AColumn>

JOIN SUBPROJECT SU ON SU.SU_NR = PR.PR_NR AND SU.SU_SUB = <AColumn>

The quotation table has only 30.000 records and is not performing too good. That SUBPROJECT JOIN has to JOIN on 2 fields because of the composite primary key. VARCHAR(7) + (4) = 11, sounds like 22 bytes to me, instead of 4 bytes of an Integer.


Firebird's index key generator is sensitive to major datatypes and tries to minimize the key size by eliminating trailing blanks in strings - which won't exist in your case because the actual data is zero filled on the left, not blank filled on the right.  Numeric keys are represented as a mangled double precision float.  The mangling causes the number to compare correctly bytewise (moving and inverting the sign) and eliminates trailing zeros after the decimal point.  

Compound keys are handled so they also compare bytewise and should be reasonably quick.


So, yes, I do think you'd be better off with integer keys (skip the small int) but no, I don't think that's the problem with your queries.  Things that would help figure out why looking up stuff in a 30K record table is slow might include:  query plan, indexes on non-primary keys, query stats specifically reads vs. fetches, and probably other things I'm not thinking of now.

Generally, you're right that lying to Firebird by calling a number a variable length character string will lead to unhappiness...  just not this particular unhappiness.

Good luck,

Ann