Subject Re: [firebird-support] Text search ...
Author Andrew Lowe
On 13/03/19 06:44, Lester Caine lester@... [firebird-support] wrote:
> I've got a few sites where I've got a growing number of PDF files
> whose content it would be nice to actually index. First problem is
> obviously the varying quality of the PDFs, and I've had FineReader
> deployed in some cases to provide OCRed copies of the original, with the
> usual variable success. The question is just what is the best base to be
> working towards. I'm currently working on the basis that we store the
> original file, and I create thumbnails of the front page, so I'm now
> looking at stripping the raw text. Anybody been there already? Any
> suggestions for Linux based solutions ...
>
> The current indexing process is pulling a list of words from the
> document and building a manual index. It was first working pre-Firebird
> and has not changed, so is there a better way with FB3?
>

You might want to have a look at Zotero. It does a lot of stuff
with PDFs, databases, etc.
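
For the "list of words per document" index described in the quoted
message, a rough sketch in Python might look like the following. It is
only an illustration, not the poster's actual process: it assumes the
pdftotext command from poppler-utils is installed on the Linux box, that
scanned PDFs have already been OCRed, and the function names
extract_text and build_index are made up for this example. Storing the
result in Firebird is left out.

```python
import re
import subprocess
from collections import defaultdict

# Assumption: a "word" is any run of ASCII letters or digits.
WORD_RE = re.compile(r"[a-z0-9]+")


def extract_text(pdf_path):
    """Return the raw text of a PDF via the pdftotext CLI (poppler-utils).

    Assumes pdftotext is on PATH; image-only PDFs yield little or no text
    unless they have been OCRed first.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" sends text to stdout
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def build_index(texts):
    """Map each lower-cased word to the sorted list of document ids using it.

    `texts` is a dict of {document id: extracted text}.
    """
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for word in WORD_RE.findall(text.lower()):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}
```

Feeding it something like build_index({1: extract_text("a.pdf")}) gives
a word-to-documents mapping that could then be loaded into a pair of
tables (words, and word/document links), which is roughly what a manual
index amounts to.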

Andrew