Subject Re: Google-like scoring in databases
Author Roman Rokytskyy
> >And you can change the weight depending on where the search terms
> >were found (for example, a headline match would get more weight
> >than a content match).
> >
> Unnecessary, at least in this case. Since the headline is shorter,
> any word in the headline will carry a higher weight than any word in
> the body farther back than the number of words in the headline. But
> keep thinking. There may be other cases.

Interesting observation. I wonder if this is used in Lucene and
ht://dig...
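
A minimal sketch of that observation, assuming the weight of a word
simply decays with its position inside its field as 1 / (1 + position).
The thread does not say which function is meant, and the names
position_weight and score are made up for illustration; nothing here is
taken from Lucene or ht://dig.

```python
def position_weight(position):
    """Weight of a word at a zero-based position in its field.

    Assumption: weight decays as 1 / (1 + position). Any strictly
    decreasing function would show the same effect.
    """
    return 1.0 / (1 + position)


def score(field_text, query_word):
    """Sum the position weights of every occurrence of query_word."""
    words = field_text.lower().split()
    target = query_word.lower()
    return sum(position_weight(i) for i, w in enumerate(words) if w == target)


# Hypothetical sample document, not from the thread.
headline = "Firebird gains text search"
body = "The new release of the database adds a plugin for text search"

headline_score = score(headline, "search")  # word at position 3
body_score = score(body, "search")          # word at position 11

# The smallest weight any headline word can get...
min_headline_weight = position_weight(len(headline.split()) - 1)
# ...is still larger than the weight of a body word that sits farther
# back than the headline is long.
first_body_weight_beyond = position_weight(len(headline.split()))

print(headline_score > body_score)
print(min_headline_weight > first_body_weight_beyond)
```

With such a scheme no separate headline boost is needed for body words
beyond the headline length, but a body word in the first few positions
would still tie with a headline word at the same position.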

> >I think a separate test is needed to see whether we can get pretty
> >good selectivity/relevance from database indexing by using only
> >lexical information. From my experience of integrating ht://dig
> >with the CoreMedia content management system, you can get good
> >results without using references between documents.
> >
> Are you looking for a solution to a problem or do you have a
> solution looking for a problem?

The problem: you develop a content store that happens to use tables to
store data, and you need a search engine there.

I'm trying to find out whether we need a search engine within the
Firebird RDBMS. In Fulda I was discussing a plugin architecture for
Firebird that would let us add a text search engine and make it
available to server-side code (similar to DB2 Text Extender and Oracle
interMedia).

If we find that we do not need a search engine in the database, and we
can happily live with one integrated in the access layer (for example,
a Lucene extension to EJB/JDO/Hibernate; an ht://dig extension for
IBObjects/IBX/FIBPlus), then we do not need a plugin architecture in
the first place (at least to solve this issue).

Roman