Subject | Text Search + Security |
---|---|
Author | red_october_canada |
Post date | 2008-06-14T23:14:39Z |
I'm building a document handling system. The document files (.doc
mostly) are stored as discrete files in a normal file folder system on
a Windows 2003 server. Each document in the file system has a
corresponding record in a FB 2.0.3 database. The records in the
database contain meta-data about the documents on disk, including
"security" information about whether the currently logged-in user is
permitted to access the document. All pretty basic stuff... DB design
class 101.
I need to allow my users to do a full text search of the contents of
the documents, a text search of each field of meta-data, and obey the
security settings stored in the database.
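To make the three requirements concrete, here is a minimal sketch of a single query that searches extracted text and meta-data while filtering on permissions. It uses Python's built-in sqlite3 purely as a stand-in for Firebird, and the table and column names (documents, permissions, body_text, user_name) are hypothetical, not taken from the actual schema.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (SQLite stand-in, hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author TEXT,
            body_text TEXT          -- extracted document text
        );
        CREATE TABLE permissions (
            doc_id INTEGER,
            user_name TEXT          -- user allowed to see doc_id
        );
    """)
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?, ?)",
        [
            (1, "Budget 2008", "alice", "quarterly budget forecast"),
            (2, "Staff memo", "bob", "budget cuts for next year"),
        ],
    )
    conn.executemany(
        "INSERT INTO permissions VALUES (?, ?)",
        [(1, "carol"), (2, "carol"), (2, "dave")],
    )
    return conn


def search(conn, user_name, term):
    """Return ids of documents the user may access that match the term
    in the extracted text or in any meta-data field."""
    pattern = f"%{term.lower()}%"
    rows = conn.execute(
        """
        SELECT d.id
        FROM documents d
        JOIN permissions p ON p.doc_id = d.id
        WHERE p.user_name = ?
          AND (LOWER(d.body_text) LIKE ?
               OR LOWER(d.title) LIKE ?
               OR LOWER(d.author) LIKE ?)
        ORDER BY d.id
        """,
        (user_name, pattern, pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]
```

The point of the sketch is that the security check lives in the same query as the text match, so a document the user cannot access never reaches the result set. A plain LIKE scan like this will not scale to large text columns; it only illustrates the shape of the query.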
Several options are available to me, one of which is to use some kind
of "text extractor" to extract the text of each document and store it
in a BLOB field in the database. I would store both a keyword list with
counts (for relevancy stats) and all the words of the document in one
continuous string (for "phrase" searching). I think the 16TB limit of
FB 2.0.3 would be adequate for this.
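As an illustration of the two stored forms described above, here is a minimal Python sketch that turns already-extracted text into a keyword/count map and a single normalized string for phrase matching. The function names and the tokenization rule (lowercase runs of letters and digits) are assumptions for illustration; extracting the text from a .doc file is not covered here.

```python
import re
from collections import Counter

# Assumed tokenization rule: lowercase runs of ASCII letters and digits.
WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return WORD_RE.findall(text.lower())


def build_index_entry(text):
    """Return (keyword_counts, phrase_text) for one document.

    keyword_counts maps each word to its number of occurrences
    (usable for relevancy stats); phrase_text is all words joined
    by single spaces (usable for phrase searching).
    """
    words = tokenize(text)
    return Counter(words), " ".join(words)


def contains_phrase(phrase_text, phrase):
    """Check whether the phrase occurs as whole words in phrase_text."""
    needle = " ".join(tokenize(phrase))
    if not needle:
        return False
    # Pad both sides with spaces so matches start and end on word boundaries.
    return f" {needle} " in f" {phrase_text} "
```

Both values could then be written to the BLOB fields of the document's record. Normalizing the phrase string with the same tokenizer used for queries keeps punctuation and case differences from causing missed matches.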
Please vote:
1) red_october_canada... you're insane
2) red_october_canada... you're damn insane
3) I actually do this myself, and it works great
4) There is another alternative
If you vote 1 or 2, please say why
If you vote 3, can you recommend a "text extractor"?
If you vote 4, can you tell me what the alternative is?