Subject RE: [firebird-support] searching a varchar(100) column
Author David Johnson
On Sat, 2005-04-30 at 09:11 +1000, Alan McDonald wrote:
> > The problem is that "like" cannot use an index, so everything becomes a
> > table space scan.
> >
> > I don't know how this will work - It is similar to a couple of
> > suggestions I have already seen in response to your question.
> >
> > 1. make the primary key of your book table something other than the
> > title. ISBN is a good choice. I'll just call this <<primary key>>
> >
> > 2. create a separate keyword to book relationship table, that is
> > structured as keyword, <<primary key>>. I would be tempted to make its
> > primary key both of its columns.
> too much disk space/DB pages used for the index. you're always better off
> here with a surrogate integer key as PK. Comments?
Disk space is cheap - $1 per gigabyte. The surrogate integer key
requires an additional field and does not help in the business at hand.

The only purpose of the PK on the relationhip table is to make the table
act as an index on the books table, and keep the relationships unique.
You can't index on substrings, but this is a way that you can fake it.
If the PK of the books table is the ISBN, the relationship index will
still be quite small.

> >
> > 3. select b.title, <<primary key>>, count(*)
> > from kwd_book_rel a,
> > books b
> > where a.keyword in (.....)
> > and b.<<primary key>> = a.<<primary key>>
> > group by b.title
> > order by 2 desc
> >
> > will return a list of all titles containing the keywords, in order of
> > most to least hits.
> >
> > The list can be pared down more with a little effort, but that's the
> > gist.
> now all we need is a UDF/SP/methodology to do the parsing.
> Maybe client side parsing with insert queries. duplicates could be ignored
> under this schema and removed during a housekeeping procedure with little
> penalty
> Alan


String [] s =s title.split(" ");
for (int i=0;i<s.length;i++)