Search for PDF content?

Ideas for improvements and requests for new features in XnView MP

Moderators: XnTriq, xnview

docmax
Posts: 4
Joined: Wed May 05, 2021 10:56 am

Search for PDF content?

Post by docmax »

Is there support for searching the text content inside PDF files? Can this be implemented?
cday
XnThusiast
Posts: 2904
Joined: Sun Apr 29, 2012 9:45 am
Location: Cheltenham, U.K.

Re: Search for PDF content?

Post by cday »

docmax wrote: Wed May 05, 2021 11:05 am Is there support for searching the text content inside PDF files? Can this be implemented?
PDF files created in XnView software are saved as image bitmaps; the only file property that can be selected when the file is saved is the image compression...

What type of 'content' are you referring to in your post?
docmax
Posts: 4
Joined: Wed May 05, 2021 10:56 am

Re: Search for PDF content?

Post by docmax »

If a PDF is created with OCR, it is usually possible for viewers to search for text.
In Windows, for example, the Windows Indexer scans PDFs and creates indexes to search for text inside PDF files (the "content").

I wonder if this can be implemented in XnView (ideally via Quick Search).
This would be great!
cday
XnThusiast
Posts: 2904
Joined: Sun Apr 29, 2012 9:45 am
Location: Cheltenham, U.K.

Re: Search for PDF content?

Post by cday »

docmax wrote: Wed May 05, 2021 11:42 am If a PDF is created with OCR, it is usually possible for viewers to search for text.
In Windows, for example, the Windows Indexer scans PDFs and creates indexes to search for text inside PDF files (the "content").

I wonder if this can be implemented in XnView (ideally via Quick Search).
This would be great!
I can't speak for Pierre, but although in principle it could indeed be a useful feature, with his current workload I think it is unlikely to be a priority... :(

The normal way to make a PDF image file searchable, as you likely already know, is to use one of the commercial software packages, such as Adobe Acrobat, ABBYY FineReader or OmniPage, which are now very powerful but quite expensive.

Although there is open source OCR software available, as far as I know it is based on what are now fairly old engines, which are likely to be less tolerant of limitations in source images. There are also now online OCR converters available.

As many scanners are supplied with a reduced-functionality version of an OCR program, another possibility might be to use that, where available, to make image PDF files searchable.
docmax
Posts: 4
Joined: Wed May 05, 2021 10:56 am

Re: Search for PDF content?

Post by docmax »

I don't want XnView to do the OCR. The OCR is already done and the OCR info is already "embedded" in the PDF document.
The only job XnView has to do is to search that "metadata".

My "dream" behaviour would be: type my search pattern into the quick search field, and only files containing that pattern show up.
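
To make the idea concrete, here is a rough sketch of that kind of filter outside XnView. It is only an illustration under stated assumptions: it uses the third-party pypdf package to read the text layer that OCR software has already embedded, and the function names extract_text_layer and filter_by_content are made up for this example. It says nothing about how XnView MP would or should implement it.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def extract_text_layer(pdf_path: Path) -> str:
    """Return the text already embedded in a PDF; no OCR is performed here."""
    # Assumption: the third-party pypdf package is installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() yields an empty string for pages without a text layer.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def filter_by_content(
    paths: Iterable,
    pattern: str,
    extract: Callable[[Path], str] = extract_text_layer,
) -> List[Path]:
    """Keep only the files whose text layer contains the pattern (case-insensitive)."""
    needle = pattern.casefold()
    matches = []
    for raw in paths:
        path = Path(raw)
        try:
            text = extract(path)
        except Exception:
            # Unreadable, encrypted or damaged files are simply skipped.
            continue
        if needle in text.casefold():
            matches.append(path)
    return matches
```

Something like filter_by_content(Path("scans").glob("*.pdf"), "invoice") would then return only the matching files (the folder name is just an example). A real quick search would presumably need a cached index rather than re-reading every file on each keystroke, which is what the Windows Indexer does.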
cday
XnThusiast
Posts: 2904
Joined: Sun Apr 29, 2012 9:45 am
Location: Cheltenham, U.K.

Re: Search for PDF content?

Post by cday »

docmax wrote: Wed May 05, 2021 12:02 pm I don't want XnView to do the OCR. The OCR is already done and the OCR info is already "embedded" in the PDF document.
The only job XnView has to do is to search that "metadata".

My "dream" behavior would be: Type my search pattern into the quick search field and only files with that pattern would show up.
You are suggesting an addition to the search capabilities in the file browser, to search for keywords within searchable PDF files?
docmax
Posts: 4
Joined: Wed May 05, 2021 10:56 am

Re: Search for PDF content?

Post by docmax »

exactly