Monitor folders - make PDF files automatically searchable

Step-by-step instructions for automatically making PDF files searchable (OCR) with Automatic PDF Processor for Windows

Create a new profile

To create a new profile, click the "New Profile..." button in the upper toolbar. In the configuration window that opens, give the profile a meaningful name (for example: Make scanned files searchable) and optionally add a comment, e.g., the source folder. To make completed tasks easier to tell apart in the log list, you can also assign a label color.

Set the monitored folder

Select the folder to be monitored: click the "Add..." button and select one of the folders listed there. As soon as new PDF files arrive in this folder, the program detects them and, if the filter criteria are met, processes them automatically. In this case, they are made searchable by OCR.
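
To illustrate what folder monitoring means conceptually, here is a minimal polling sketch in Python. It is not how Automatic PDF Processor is implemented; the function names `find_new_pdfs` and `watch` and the 5-second polling interval are assumptions made for this example.

```python
from pathlib import Path
import time


def find_new_pdfs(folder: Path, seen: set[Path]) -> list[Path]:
    """Return PDF files in `folder` that are not yet in `seen`, and remember them."""
    new_files = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and p not in seen
    )
    seen.update(new_files)
    return new_files


def watch(folder: Path, handle, interval_seconds: float = 5.0) -> None:
    """Poll `folder` forever and call `handle(path)` once per detected PDF.

    Files already present on the first pass are also treated as new.
    """
    seen: set[Path] = set()
    while True:
        for pdf in find_new_pdfs(folder, seen):
            handle(pdf)
        time.sleep(interval_seconds)
```

Calling `watch(Path("C:/Scans"), print)` would print each PDF once as it appears; the folder path here is only a placeholder.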

Set one or more filters

You can optionally specify filter criteria to process only specific PDF files. For example, you can filter on file properties such as part of the file name, or on document properties such as author, subject, or title. Because scanned files contain no text before OCR, the document text can only be used as a filter in a separate profile that works on the already searchable files. Filter terms can be combined with logical AND as well as OR. If you do not enter a filter term, all PDF files arriving in the watched folder are made searchable automatically.
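
The way filter terms combine with AND and OR can be sketched in Python as follows. This is only an illustration of the logic, not the program's filter engine; `PdfInfo`, `name_contains`, `property_contains`, `all_of`, and `any_of` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PdfInfo:
    """Hypothetical container for the properties a filter can inspect."""
    file_name: str
    properties: dict[str, str] = field(default_factory=dict)  # e.g. author, subject, title


def name_contains(term: str):
    """Match if `term` occurs in the file name (case-insensitive)."""
    return lambda pdf: term.lower() in pdf.file_name.lower()


def property_contains(key: str, term: str):
    """Match if `term` occurs in the document property `key` (case-insensitive)."""
    return lambda pdf: term.lower() in pdf.properties.get(key, "").lower()


def all_of(*conditions):
    """Logical AND; with no conditions at all, every file matches."""
    return lambda pdf: all(c(pdf) for c in conditions)


def any_of(*conditions):
    """Logical OR."""
    return lambda pdf: any(c(pdf) for c in conditions)


# File name contains "scan" AND (author contains "smith" OR title contains "invoice")
scan_filter = all_of(
    name_contains("scan"),
    any_of(property_contains("author", "smith"), property_contains("title", "invoice")),
)
```

An empty `all_of()` accepts every file, which mirrors the behavior described above when no filter term is entered.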

Activate and configure task Make File Searchable

Next, select the category Make File Searchable and set the task status to Active.

Then select the directory where the searchable PDF file should be stored. The input field for the file name can be left empty; in this case, the name of the original file is used. Now specify the expected language of the document content. You can add more languages using the button next to the language field. More than 120 languages are available for selection.

For multilingual documents, you can set an additional language. In many cases, however, this degrades the OCR result, so enable it only when it is really needed.

Before the OCR process, the images in the PDF document are prepared according to the activated options. Rotation correction and deskewing of scanned pages should be enabled if pages are skewed by more than 5 degrees. The options "Re-sharpen" and "Increase contrast" are intended for older files that are only available in low resolution; for such files, the two options can significantly improve the OCR result. In general, however, the input resolution for scanning should be set to at least 225 dpi, preferably 300 dpi. A resolution higher than 300 dpi no longer noticeably improves the OCR recognition rate and only increases the file size.

Searchable PDF files are often significantly larger than the original files. If the available disk space is very limited, the DPI value of the output file can be reduced. Note, however, that this lowers the OCR recognition rate as well as the required storage space.
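
A short calculation shows why resolution has such a strong effect on size. The sketch below computes the pixel dimensions of an A4 page (210 x 297 mm) at several resolutions; the function name `page_pixels` is an assumption for this example, and the figures refer to raw pixel counts, not to the size of a compressed PDF.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # width, height of an A4 page in millimetres


def page_pixels(dpi: int, size_mm: tuple[int, int] = A4_MM) -> tuple[int, int]:
    """Pixel dimensions (width, height) of a page scanned at `dpi` dots per inch."""
    width_mm, height_mm = size_mm
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


for dpi in (150, 225, 300, 600):
    w, h = page_pixels(dpi)
    print(f"{dpi:>3} dpi: {w} x {h} px = {w * h / 1e6:.1f} megapixels")
```

Because both width and height scale with the resolution, the pixel count grows with the square of the DPI value: going from 300 to 600 dpi roughly quadruples the amount of image data per page.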

A second profile can then monitor the storage folder and, for example, rename the now searchable PDF files or move them to different folders based on their content.

Status notifications

Here, you can specify whether status messages about the processing of each PDF file should be sent to a specific email address. To send the status message, you can use either the Outlook email account set as default or an email account with user-defined properties. The send settings are configured in the program options.
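
For readers who want to picture what such a status message involves, here is a rough sketch using Python's standard library. It does not reflect the program's actual message format or send mechanism; the function names, the subject wording, and all addresses, hosts, and credentials are placeholders assumed for this example.

```python
from email.message import EmailMessage
import smtplib


def build_status_message(pdf_name: str, succeeded: bool,
                         sender: str, recipient: str) -> EmailMessage:
    """Build a plain-text status message for one processed PDF file."""
    status = "succeeded" if succeeded else "failed"
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"OCR {status}: {pdf_name}"
    msg.set_content(f"Processing of {pdf_name} {status}.")
    return msg


def send_status_message(msg: EmailMessage, host: str, port: int,
                        user: str, password: str) -> None:
    """Send via SMTP with STARTTLS; host, port, and credentials are placeholders."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending makes it possible to check the message content without a mail server.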

Make older PDF files searchable

Finally, you can apply the created profile to existing PDF files from a certain period, i.e., make all matching older PDF files searchable as well. To do this, select the profile in the profile list and click Catch-up. Otherwise, the program applies the profile only to newly arriving PDF files in the watched folders.
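
The idea of selecting existing files from a period can be sketched as follows. This example assumes the period refers to the file's modification time, which may differ from the date the program actually uses; `pdfs_in_period` is a hypothetical helper name.

```python
from datetime import datetime
from pathlib import Path


def pdfs_in_period(folder: Path, start: datetime, end: datetime) -> list[Path]:
    """Existing PDF files in `folder` whose modification time lies in [start, end]."""
    matches = []
    for path in folder.iterdir():
        if path.is_file() and path.suffix.lower() == ".pdf":
            # Assumption: the period is compared against the modification time.
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            if start <= modified <= end:
                matches.append(path)
    return sorted(matches)
```

For instance, `pdfs_in_period(Path("C:/Scans"), datetime(2023, 1, 1), datetime(2023, 12, 31))` would list the PDF files last modified during 2023; the folder path is a placeholder.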
