Search for PDF content?
Is it possible to search the text content of PDF files? If not, could this be implemented?
Re: Search for PDF content?
PDF files created in XnView are saved as image bitmaps; the only file property that can be selected when the final image is saved is the image compression...
What type of 'content' are you referring to in your post?
Re: Search for PDF content?
If a PDF is created with OCR, it is usually possible for viewers to search for text.
In Windows, for example, the Windows Indexer scans PDFs and creates indexes for searching the text inside PDF files (the "content").
I wonder if this could be implemented in XnView (ideally via Quick Search).
This would be great!
Re: Search for PDF content?
docmax wrote: Wed May 05, 2021 11:42 am
> If a PDF is created with OCR, it is usually possible for viewers to search for text.
> In Windows, for example, the Windows Indexer scans PDFs and creates indexes for searching the text inside PDF files (the "content").
> I wonder if this could be implemented in XnView (ideally via Quick Search).
> This would be great!
I can't speak for Pierre, but although in principle it could indeed be a useful feature, with his current workload I think it is unlikely to be a priority...
The normal way to make an image PDF file searchable, as you likely already know, is to use one of the commercial packages, such as Adobe Acrobat, ABBYY FineReader or OmniPage, which are now very powerful but quite expensive.
Although open source OCR software is available, as far as I know it is based on what are now fairly old engines, which are likely to be less tolerant of limitations in source images. There are also now online OCR converters available.
As many scanners are supplied with a reduced-functionality version of an OCR package, another possibility, where one is available, is to use that to make image PDF files searchable.
Re: Search for PDF content?
I don't want XnView to do the OCR. The OCR is already done and the OCR info is already "embedded" in the PDF document.
The only job XnView has to do is search that "metadata".
My "dream" behavior would be: type my search pattern into the Quick Search field and only files containing that pattern would show up.
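For illustration only, the filtering behaviour described above could be sketched roughly like this in Python. This is not how XnView works internally. The function name `filter_by_embedded_text`, the `extract_text` callable (a hypothetical stand-in for whatever PDF library would read the embedded text layer) and the case-insensitive substring matching are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def filter_by_embedded_text(
    paths: Iterable[str],
    pattern: str,
    extract_text: Callable[[str], str],
) -> List[str]:
    """Return only the paths whose embedded text contains the pattern.

    extract_text is a hypothetical callable that returns the text layer
    of a PDF (for example, the text an earlier OCR pass embedded in it).
    """
    needle = pattern.casefold()
    matches = []
    for path in paths:
        try:
            text = extract_text(path)
        except Exception:
            # Unreadable file or no text layer: treat it as "no match".
            continue
        if needle in text.casefold():
            matches.append(path)
    return matches


# Stand-in for a real extractor: maps file names to a made-up text layer.
fake_text_layer = {
    "invoice.pdf": "Invoice 2021 for scanned goods",
    "manual.pdf": "User manual, chapter 1",
}

print(filter_by_embedded_text(fake_text_layer, "INVOICE", fake_text_layer.__getitem__))
```

In a real browser the extracted text would presumably be cached or indexed (as the Windows Indexer does) rather than re-read on every keystroke, but that is beyond this sketch.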
Re: Search for PDF content?
docmax wrote: Wed May 05, 2021 12:02 pm
> I don't want XnView to do the OCR. The OCR is already done and the OCR info is already "embedded" in the PDF document.
> The only job XnView has to do is search that "metadata".
> My "dream" behavior would be: type my search pattern into the Quick Search field and only files containing that pattern would show up.
So you are suggesting an addition to the search capabilities of the file browser, to search for keywords within searchable PDF files?