Subject | RE: [firebird-support] searching a varchar(100) column |
---|---|
Author | Helen Borrie |
Post date | 2005-05-01T00:18:33Z |
At 09:11 AM 30/04/2005 +1000, Alan McDonald wrote:
atomic - both come from external sources and are subject to variation and
human error.
Book title is definitely a bad choice, thanks to the way books are
listed. If the title is listed in one source as "Firebird Book, The" and
in another "The Firebird Book", what rule are you going to use?
ISBN is also variable and subject to change. On the book, you see
"1-59059-279-4". Do you key on this string and hope everyone gets the
hyphens in the right positions? or do you eliminate the hyphens (as one
could do by applying a fairly complicated trigger to inserts and updates)
and hope that searchers understand that you don't store hyphens in ISBNs?
The ISBN of a book will change if another publisher takes it up, as occurs
quite frequently with technical books. The original publisher goes broke
and reprints are issued by a new publisher, without the book itself
changing at all. Not all publications have ISBNs, either.
On the size issue, an ISBN without the hyphens is 9 bytes, which doesn't
make it a big loser against a BigInt (8 bytes).
./heLen
> > The problem is that "like" cannot use an index, so everything becomes aI always prefer an atomic primary key. Neither book title nor ISBN is
> > table space scan.
> >
> > I don't know how this will work - It is similar to a couple of
> > suggestions I have already seen in response to your question.
> >
> > 1. make the primary key of your book table something other than the
> > title. ISBN is a good choice. I'll just call this <<primary key>>
> >
> > 2. create a separate keyword to book relationship table, that is
> > structured as keyword, <<primary key>>. I would be tempted to make its
> > primary key both of its columns.
>
>too much disk space/DB pages used for the index. you're always better off
>here with a surrogate integer key as PK. Comments?
atomic - both come from external sources and are subject to variation and
human error.
Book title is definitely a bad choice, thanks to the way books are
listed. If the title is listed in one source as "Firebird Book, The" and
in another "The Firebird Book", what rule are you going to use?
ISBN is also variable and subject to change. On the book, you see
"1-59059-279-4". Do you key on this string and hope everyone gets the
hyphens in the right positions? or do you eliminate the hyphens (as one
could do by applying a fairly complicated trigger to inserts and updates)
and hope that searchers understand that you don't store hyphens in ISBNs?
The ISBN of a book will change if another publisher takes it up, as occurs
quite frequently with technical books. The original publisher goes broke
and reprints are issued by a new publisher, without the book itself
changing at all. Not all publications have ISBNs, either.
On the size issue, an ISBN without the hyphens is 9 bytes, which doesn't
make it a big loser against a BigInt (8 bytes).
./heLen