Subject | Google-like scoring in databases |
---|---|
Author | Roman Rokytskyy |
Post date | 2003-06-29T11:47:11Z |
Hi,
I recently indexed the JDK 1.3.1 javadocs with Lucene. Searching for
"(input stream)~5" (all documents containing "input" and "stream" no
more than 5 words apart) scores
javax.sound.sampled.AudioInputStream higher than java.io.InputStream.
What do you think: does it make sense to have a Google-like scoring
system in relational databases?
By "Google-like" scoring I mean scoring in which not only the lexical
score of a document matters, but also the number of references to
that document. How would we define references in this case?
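To make the idea concrete, here is a rough sketch of what I have in
mind. Everything in it is an assumption for illustration only: the
function name combined_score, the logarithmic damping of the reference
count, and the sample numbers are all made up, and none of this is
what Lucene or Google actually computes.

```python
import math


def combined_score(lexical_score: float, inbound_refs: int, weight: float = 1.0) -> float:
    """Boost a lexical score by the number of references to the document.

    Hypothetical formula: the reference count is damped with log1p so
    that a heavily referenced document cannot swamp the lexical score.
    """
    return lexical_score * (1.0 + weight * math.log1p(inbound_refs))


# Made-up numbers: (lexical score, number of references to the document).
docs = {
    "java.io.InputStream": (0.8, 500),
    "javax.sound.sampled.AudioInputStream": (0.9, 20),
}

# Rank documents by the combined score, best first.
ranked = sorted(docs, key=lambda name: combined_score(*docs[name]), reverse=True)
print(ranked)
```

With these invented inputs the more heavily referenced class ranks
first even though its lexical score is slightly lower, which is the
behaviour I would expect from such a scoring.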
Jim, some time ago you described the approach you implemented in
Netfrastructure, where a search returns multiple result sets.
Any comments from that point of view?
Best regards,
Roman Rokytskyy