Subject Re: Index tales - part 2 - Keyword FTS
Author m_theologos
--- In, Jim Starkey <jas@...>
> Roman Rokytskyy wrote:
> > Jim,
> >
> >
> >> Text search needs to be multi-table and multi-field to be useful.
> >>
> >
> > I guess, you did not change your approach to this topic
> > right? Then your multi-table search can be used only via API and
> > result of such query is set of result sets, which cannot be
> > represented via SQL.
> >
> Yes, that's correct. The result of a search operation is a ResultList
> that can be iterated similar to a ResultSet, but the "value" of a
> ResultList is a ResultSet rather than a scalar value.

And so you'll take the result out of the server's engine. The
entire processing must then be done on the client side, building the
appropriate functions from scratch. I'd rather prefer a more SQL-like
approach, using JOINs, views and so on.
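
To make the "more SQL" idea concrete, here is a minimal sketch of a keyword search expressed as an ordinary JOIN. It uses Python's sqlite3 module only as a stand-in engine, not Firebird, and the `keywords` table, the `build_keyword_index` and `search` helpers, and the sample rows are all hypothetical names invented for illustration.

```python
import re
import sqlite3


def build_keyword_index(conn, table, column):
    """Fill a hypothetical keywords(word, row_id) table from one text column."""
    conn.execute("CREATE TABLE IF NOT EXISTS keywords (word TEXT, row_id INTEGER)")
    conn.execute("CREATE INDEX IF NOT EXISTS keywords_word ON keywords (word)")
    rows = conn.execute(f"SELECT id, {column} FROM {table}").fetchall()
    for row_id, text in rows:
        # One index entry per distinct lower-cased word in the column value.
        for word in set(re.findall(r"\w+", (text or "").lower())):
            conn.execute("INSERT INTO keywords VALUES (?, ?)", (word, row_id))


def search(conn, word):
    """Keyword lookup as a plain JOIN, so it composes with views and filters."""
    sql = """
        SELECT c.id, c.name
        FROM contacts c
        JOIN keywords k ON k.row_id = c.id
        WHERE k.word = ?
        ORDER BY c.id
    """
    return conn.execute(sql, (word.lower(),)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "Ann", "12 Main Street, Springfield"),
        (2, "Bob", "7 Oak Avenue, Shelbyville"),
        (3, "Cid", "99 Main Road, Springfield"),
    ],
)
build_keyword_index(conn, "contacts", "address")
matches = search(conn, "Main")
```

Because the result is a normal result set, it can be joined, filtered or wrapped in a view like any other query, which is the point being argued here.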

> > But MySQL as well as your Netfrastructure provide a keyword that
> > allowed to perform a query against single table (and they are not
> > only ones - Oracle and MS SQL do similar things). Do you want to say
> > that MATCH/MATCHING operators and alike are not useful at all?
> >
> I have a customer who is quite happy using the MATCHING predicate
> against a single column. The implementation is a general search
> constrained to a single table and column. The semantics are
> the same.
> The Falcon search code will be part of the initial Falcon alpha
> base even though it isn't accessible through MySQL. Like other
> code, it will be released under the GPL, but the ideas will be there for
> the taking. MySQL has an existing full text search capability that
> I would prefer not to comment on publicly.
> > I'd say that suggested keyword search is a simplified version of an
> > operator that accesses a multi-field FTS index, however only for one
> > table. Having that case implemented would be of great benefit for
> > Firebird, considering amount of applications that use FTS via SQL.
> > Don't you agree?
> >

I agree, and I think it is easy to implement. Of course, if you
want a more advanced approach, things change. Then we'd need a more
dedicated structure, IMHO. If you're interested, drop me a line.

Also, please observe that, generally speaking, each column to be
indexed tends to have its own vocabulary. For example:

On an ERP:

The ACCOUNTS.DESCRIPTION will have a quite different vocabulary in
let's say all of these fields have the same data type (or domain).

On a CRM:

The CONTACTS.ADDRESS will be very different lexically compared with
last one has other separators, other stop words etc.)
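
A rough sketch of what "its own vocabulary" could mean in practice: each indexed column carries its own separators and stop words. Only the column names ACCOUNTS.DESCRIPTION and CONTACTS.ADDRESS come from the examples above; the `ColumnVocabulary` class and the particular separator and stop-word sets are assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnVocabulary:
    """Hypothetical per-column tokenizer settings."""
    separators: str        # body of a regex character class
    stop_words: frozenset  # words that are never indexed

    def tokenize(self, text):
        parts = re.split(f"[{self.separators}]+", text.lower())
        # Drop empty fragments and this column's stop words.
        return [p for p in parts if p and p not in self.stop_words]


# Assumed settings; a real deployment would tune these per column.
VOCAB = {
    "ACCOUNTS.DESCRIPTION": ColumnVocabulary(
        r"\s,.;", frozenset({"the", "of", "for", "and"})
    ),
    "CONTACTS.ADDRESS": ColumnVocabulary(
        r"\s,/#", frozenset({"st", "str", "no"})
    ),
}

description_words = VOCAB["ACCOUNTS.DESCRIPTION"].tokenize(
    "Payment of the invoice for Q3, net."
)
address_words = VOCAB["CONTACTS.ADDRESS"].tokenize("12 Main St, Apt #4/B")
```

The same raw text would produce different keywords depending on which column's settings are applied, which is why a single shared vocabulary tends to fit none of the columns well.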

So the index will be much bigger than we need for a single-table FTS
search. Also, building a multi-table structure needs new procedures to
deal with updating it, searching, retrieving and so on. I only propose
a useful feature (IMHO) which can be implemented relatively simply,
based on a (very) well-verified engine. If we later want (let's say in
version 4) to add proximity search, synonyms, multi-table parallel
keywords or multi-language fuzzy search, that's fine too.

I think (IMHO) it is better to implement things step by step (no
stupid crap inside the engine, of course), rather than leaving a
feature unimplemented because we cannot make it perfect from the start.

> Personally, I believe that web style applications are better
> applications than the traditional File/Edit/View framework, and web
> applications begin with search. There are good uses for a heavily
> restrictive search, however. In my book, implementing a multi-table,
> multi-field index then, if appropriate, filtering at the node
> level to a single table and field makes a great deal of sense.

...but please take into consideration that behind the web page is an
app which deals with concrete kinds of data organized in tables,
"kinds" from a _human_ point of view (I don't mean 'data types' here),
so, as you observed, 'There are good uses for a heavily restrictive
search...' In conclusion, I think that a multi-table, multi-field FTS
index is good to have, but having only this is a 'heavy' thing to deal
with, IMHO. (No SQL, lack of speed, difficult to refine and so on.)
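
For comparison, a toy sketch of the general approach under discussion: one index over every table and field, with an optional filter down to a single table and column. This is not Falcon's or Netfrastructure's actual design; the `KeywordIndex` class, its methods and the sample data are hypothetical.

```python
from collections import defaultdict


class KeywordIndex:
    """Toy multi-table, multi-field keyword index (illustration only)."""

    def __init__(self):
        # word -> set of (table, column, row_id) postings
        self._postings = defaultdict(set)

    def add(self, table, column, row_id, text):
        for word in text.lower().split():
            self._postings[word].add((table, column, row_id))

    def search(self, word, table=None, column=None):
        hits = self._postings.get(word.lower(), set())
        # The restriction to one table/column is a filter applied after
        # the lookup; the index itself still holds every table's words.
        return sorted(
            h for h in hits
            if (table is None or h[0] == table)
            and (column is None or h[1] == column)
        )


index = KeywordIndex()
index.add("CONTACTS", "ADDRESS", 1, "12 Main Street")
index.add("ACCOUNTS", "DESCRIPTION", 7, "main office rent")

everywhere = index.search("main")
contacts_only = index.search("main", table="CONTACTS", column="ADDRESS")
```

Note that the unrestricted search returns postings from several tables at once, which is the kind of result that does not map onto a single SQL result set.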

hth, (my 2c)

m. th.