Subject Re: Document Indexing/Querying experience with FB?
Author Tom Conlon
Hi Cosmin,

Many thanks for your detailed reply - I need to study it in detail.

If someone can assist with the most efficient SQL for FB in this
situation (or other suggestions how to attack the problem) it would be
good. I'm going to change it slightly (first OR becomes an AND) and
put wordid in brackets:

("TRAINEE[1] POLICEMAN[2]" AND "TEAM[3] LEADER[4]" ) AND
(MONETARY[5] OR INSURANCE[6] AND "PRIVATE[7] BANK[8]")

This is the expected level of complexity (i.e not nested brackets)
although it would be good to support it if possible.

Current Options:

Option 1. Pure SQL
------------------
SELECT dw1.docid
FROM DOCUMENTWORD dw1
JOIN DOCUMENTWORD dw2 ON (dw2.docid=dw1.docid) AND (dw2.WORDID=2) AND
(dw2.WORDno=dw1.WORDno+1)
JOIN DOCUMENTWORD dw3 ON (dw3.docid=dw1.docid) AND (dw3.WORDID=3)
JOIN DOCUMENTWORD dw4 ON (dw4.docid=dw1.docid) AND (dw4.WORDID=4) AND
(dw4.WORDno=dw3.WORDno+1)
JOIN DOCUMENTWORD dw5 ON (dw5.docid=dw1.docid) AND (dw5.WORDID in (5,6)
JOIN DOCUMENTWORD dw6 ON (dw6.docid=dw1.docid) AND (dw6.WORDID=7)
JOIN DOCUMENTWORD dw7 ON (dw7.docid=dw1.docid) AND (dw7.WORDID=8) AND
(dw7.WORDno=dw6.WORDno+1)
where (dw1.wordid=1)

I think this should work (correct me if you think not) but perhaps
always joining into dw1.docid may/may not be the best idea? Arno?

BTW storing the wordno means you can also do something nice like
"TRAINEE 5 POLICEMAN" meaning trainee has to appear within 5 words of
policeman ...JOIN DOCUMENTWORD dw2 ON (dw2.docid=dw1.docid) AND
(dw2.WORDID=2) AND (dw2.WORDno<dw1.WORDno+5)


Option 2. - In-memory comparisons
---------------------------------
Another approach would be to rip it off the db using an index ordered
by docid,wordid,wordno and performing in-memory comparisons.

Option x,y,z?
-------------

Thanks,
Tom