firebird-support - Re: Primary Key of character a problem?

Subject	Re: Primary Key of character a problem?
Author	rjschappe
Post date	2003-07-23T06:53:11Z

> >Does anyone know if using a PK's of char(20) will cause a problem?
> >Our GUID routines creates a 20 char [0..9] sequence.
> >
> >
> No, that should be fine, with one caution.Firebird
> compresses the prefix of index keys so you'll get
> much better key density if you order the bytes from
> the most likly to be duplicated to the least. For
> example, if your GUID were composed of these
> elements (local count, machine id, computer
> manufacturer id, planet id, solar system id, Galaxy
> id) you would expect the last three elements to be
> quite stable, that there would be a few different
> values of manufacturer, a few more of machine id,
> and few duplicate local counts. If the key is built
> in that order, there will be little or no prefix
> compression. If you reverse the order, most entries
> will consiste only of the local count. It's likely
> to be a 5:1 difference in packing density - the
> density is important not to save disk space, but to
> make the index more efficient.
>
> Regards,
> Ann

Ann, thanks for the feedback! I want to make sure I understand this,
because I think I may be doing the opposite of what you are saying! I
agree the space is not what concerns me, the efficiency of the
indicies is most important.

I am thinking that the "most likely to be duplicated" is the Galaxy,
next the Solar System, and Planet... these will be duplicated for
every single record.

The ManufacturerID, MachineID, and LocalCount will be more volitile.

Are you saying that the PK is best when the more volitile elements
are in the front or the back?

Just to give some numbers - Galaxy=11, Solar System=22, Planet=33,
ManufacturerID (range 1 to 5), MachineID (range 1 to 1,000),
LocalCount (range 1 to 10,000)

I was thinking (and that is a frightening thing<grin>) that the
internal binary trees would be the most efficient if I had a nice
even distribution over the values... so in this thinking, I created
the PK like this: (remember my PK is a string)
PK:=LocalCount+MachineID+MfgID+PlanetID+SolarSysID+GalaxyID

With the most volitile components in the front - by looking at the
first "letter/number", I will be able to "weed out" 90% of the
possible records...

If Galaxy, SolarSys, and Planet were first, well I would have to look
at all the letters and all would be the same up until I finally
arrived at the 7th character and then I would finally be able to
start to narrow down the search...

Do I have this backwards... or am I thinking about this correctly?
(maybe I just need more java!!!)

Thanks for the help,
--Raymond