Subject Re: [Firebird-Architect] Re: Full Text Search
Author Jim Starkey
davidrushby wrote:

>My original question in this thread was one of optimization rather
>than possibility. Specifically, are the full-text search
>infrastructures integrated into relational databases standard inverted
>index designs, like Lucene, or are the designs heavily modified to
>better fit into a relational environment?
Netfrastructure is designed from the ground up as a backing store and
engine for 100% dynamic web sites. The sine qua non was effective
search. There was no requirement, let alone reasonable expectation, of
converting existing applications to this environment; the exercise began
with a clean sheet of paper.

Search begins without a knowledge of what information exists, let alone
how it might be structured. The relational model is very effective for
organizing data for open-ended applications. Intuitive, it is not.
The relational model is an excellent tradeoff of data abstraction and
practicality (at least after a couple of decades of hard work), but no
relational being could consider it an end in its own right. As one of
the earliest converts still practicing the art, I think I know the
strengths and the weaknesses.

The relational model captures regularity well -- much better than, say,
html. But any non-trivial collection of information has data of many
types. Expecting a user to understand the types before he can express a
question is nothing but foolishness.

Search can be melded to the relational model straightforwardly. It does,
however, require thinking of data and information as something richer
than a card reader, or even a card reader with types and triggers.
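
One way to picture search layered on relational data is a single word
index that spans every table, so a query needs no knowledge of the schema.
The following is a minimal sketch under stated assumptions, not a
description of how Netfrastructure or Firebird implements search; the
table names, sample rows, and the helpers tokenize, build_index, and
search are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample data: two unrelated tables with different columns.
TABLES = {
    "articles": [
        {"id": 1, "title": "Index tuning", "body": "Notes on inverted index design"},
        {"id": 2, "title": "Release notes", "body": "Fixes for stored procedures"},
    ],
    "contacts": [
        {"id": 1, "name": "Ann Index", "city": "Boston"},
    ],
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(tables):
    """Map each word to the set of (table, record id) pairs containing it."""
    index = defaultdict(set)
    for table, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                # Index every text column; the searcher never names a column.
                if column != "id" and isinstance(value, str):
                    for word in tokenize(value):
                        index[word].add((table, row["id"]))
    return index

def search(index, query):
    """Return records, from any table, that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)

INDEX = build_index(TABLES)
print(sorted(search(INDEX, "index")))
```

The result is a set of (table, record id) pairs; how each record is then
presented is a separate question from how it was found.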

>As for how many tables, rows, and columns an individual query should
>cover, I'd say that should be left up to the end user, rather than
>defined by the database engine.
Users don't think in terms of tables and fields. Nor should they. They
think in terms of information and how it can be manipulated. Search
reveals a set of records. Each record has a particular representation,
maybe by itself, more likely including other records related by key.
Given a record representation and the privileges associated with the
user, there is a set of logical operations that should be available to
the user to interact with the content subject to rules laid down by the
application designer.

>I don't have an opinion as to whether full text search functionality
>even belongs in a relational database engine, because I've never used
>a DB that featured it. But in any case, I don't see how it would be
>possible for every current and future user of Firebird to agree on
>what "SIMILAR TO" means, or to agree on a ranking algorithm, or a
>query syntax (does 'usual suspects' mean (phrase "usual suspects"),
>(term "usual") AND (term "suspects"), (term "usual") OR (term
>"suspects"), ...).
Have you used the IBPhoenix knowledge base? If so, you have used a
relational database with full text search functionality. Given the amount of
time that the little yellow disk light stays on, I'd say quite a few
people have mastered the art.
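
For reference, the three readings of 'usual suspects' raised in the quoted
question (phrase, AND, OR) can all be answered from one positional word
index. This is only an illustrative sketch: the sample records and the
helpers build_positions, match_or, match_and, and match_phrase are
hypothetical and do not reflect the query syntax of any particular engine.

```python
import re
from collections import defaultdict

# Hypothetical records keyed by record id.
DOCS = {
    1: "the usual suspects were rounded up",
    2: "the suspects had the usual alibi",
    3: "nothing unusual, just the usual routine",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_positions(docs):
    """Map word -> {record id -> list of word positions in that record}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index

def match_or(index, words):
    """Records containing at least one of the words."""
    return set().union(*(index.get(w, {}).keys() for w in words))

def match_and(index, words):
    """Records containing all of the words, in any order."""
    sets = [set(index.get(w, {})) for w in words]
    return set.intersection(*sets) if sets else set()

def match_phrase(index, words):
    """Records containing the words consecutively, in the given order."""
    result = set()
    for doc_id in match_and(index, words):
        starts = index[words[0]][doc_id]
        if any(all(p + i in index[w][doc_id] for i, w in enumerate(words))
               for p in starts):
            result.add(doc_id)
    return result

POSITIONS = build_positions(DOCS)
WORDS = tokenize("usual suspects")
print(match_or(POSITIONS, WORDS), match_and(POSITIONS, WORDS),
      match_phrase(POSITIONS, WORDS))
```

Each reading narrows the previous one: the phrase matches are a subset of
the AND matches, which are a subset of the OR matches.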

If you want to know what a relational database stored procedure can do,
take a look at