Subject | Re: [firebird-support] Text search ... |
---|---|
Author | Andrew Lowe |
Post date | 2019-03-13T02:35:46Z |
On 13/03/19 06:44, Lester Caine lester@... [firebird-support] wrote:
> I've got a few sites where I've got a growing number of pdf files
> for which it would be nice to actually index the content. First problem is
> obviously the different qualities of pdf, and I've had finereader
> deployed in some cases to provide OCRed copies of the original, with the
> usual variable success. The question is just what is the best base to be
> working towards. I'm currently working on the basis that we store the
> original file, and I create thumbnails of the front page so I'm now
> looking at stripping the raw text. Anybody been there already? Any
> suggestions for Linux based solutions ...
>
> The current indexing process is pulling a list of words from the
> document and building a manual index. It was first working pre-Firebird
> and has not changed so is there a better way with FB3?
>

You might want to have a look at Zotero. It does a lot of stuff
with PDFs, databases etc.

Andrew
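
For anyone reading this in the archive, the word-list indexing described in the quoted message can be sketched roughly as below. This is only a minimal illustration in Python, not the process Lester actually runs. The function names (`tokenize`, `build_index`, `search`) and the sample documents are hypothetical, and it assumes the raw text has already been stripped from the PDFs by some other tool.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for text stripped from PDFs.
docs = {
    1: "Minutes of the parish council meeting",
    2: "Council budget report",
}
index = build_index(docs)
```

With this sample data, `search(index, "council budget")` looks up each word separately and intersects the two sets of document ids. In a database-backed version the same word-to-document mapping would presumably live in a table rather than in memory.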