Subject | Text indexing and blob (or blobs as external files) |
---|---|
Author | Garrett Smith |
Post date | 2005-04-17T00:49:32Z |
I've integrated the Lucene text indexer on top of Firebird. I'm using FB
to implement a simple file directory API that Lucene uses for both
indexing and searching.
I'm currently using BLOBs to store the file contents. I've found this
approach to be problematic. Lucene assumes fast, random-access file
reads (seeking both forward and backward). To accommodate this, I'm
reading the entire BLOB contents into memory for both indexing and
searching operations.
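To make the workaround concrete, here is a minimal sketch of random-access reads over a BLOB that has already been loaded fully into memory. It assumes Java; the class name InMemoryBlobInput and its methods are hypothetical and are not part of Lucene's or Firebird's API.

```java
/** Hypothetical helper: random-access reads over a BLOB held entirely in memory. */
final class InMemoryBlobInput {
    private final byte[] data; // the entire BLOB contents, read up front
    private int position;      // current read offset into data

    InMemoryBlobInput(byte[] blobContents) {
        this.data = blobContents;
    }

    long length() {
        return data.length;
    }

    long position() {
        return position;
    }

    /** Moves the read pointer to any offset, forward or backward. */
    void seek(long offset) {
        if (offset < 0 || offset > data.length) {
            throw new IllegalArgumentException("seek out of range: " + offset);
        }
        position = (int) offset;
    }

    /** Reads one byte at the current position and advances by one. */
    byte readByte() {
        if (position >= data.length) {
            throw new IndexOutOfBoundsException("read past end of BLOB");
        }
        return data[position++];
    }

    /** Copies count bytes into dest starting at destOffset and advances. */
    void readBytes(byte[] dest, int destOffset, int count) {
        if (count < 0 || count > data.length - position) {
            throw new IndexOutOfBoundsException("read past end of BLOB");
        }
        System.arraycopy(data, position, dest, destOffset, count);
        position += count;
    }
}
```

Seeks are cheap this way, but every open costs a full BLOB read and memory proportional to the file size, which is the problem described above.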
The reason I'm using FB in the first place is to tie the indexing into
FB's transaction. I'd much rather store the files in, well, files
:-)
I've heard people mention using external files as 'blob' storage, where
FB stores the file name. I'm having a hard time figuring out how to
handle commits and rollbacks.
Does anyone have experience with this?
-- Garrett