Step-by-step instructions for automatically making PDF files searchable (OCR) with Automatic PDF Processor for Windows
Create a new profile
To create a new profile, click the "New Profile..." button in the upper toolbar. In the configuration window that
opens, you can give the profile a meaningful name (for example: Make scanned files searchable) and optionally add a comment,
e.g., the source folder. To make completed tasks easier to tell apart in the log list, you can optionally assign a label color.
Set the monitored folder
Select a folder to be monitored. As soon as new PDF files arrive in this folder, they will be detected by the program and
automatically processed if the filter criteria are met, in this case made searchable by OCR. Click the "Add..."
button and select one of the folders listed there.
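The monitoring behavior described above can be pictured as a simple polling step. The following Python sketch only illustrates the idea and says nothing about how Automatic PDF Processor works internally; the function name `find_new_pdfs` and the polling approach are assumptions made for this example.

```python
from pathlib import Path


def find_new_pdfs(folder, seen):
    """Return PDF files in `folder` that are not yet in `seen`, and record them.

    `seen` is a set of file names that were already detected in earlier calls.
    """
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        # Only regular files with a .pdf extension count (case-insensitive).
        if path.is_file() and path.suffix.lower() == ".pdf" and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling the function repeatedly with the same `seen` set returns each PDF file only once, which mirrors the idea that a file is picked up when it arrives in the monitored folder.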
Set one or more filters
You can optionally specify filter criteria to process only specific PDF files. For example, you can use file properties such as
part of the file name, or document properties such as author, subject, or title. The document text cannot serve as a filter at
this stage because the files are not yet searchable; it can only be used after the OCR process, in a new profile based on the
searchable files. You can combine filter terms with logical AND as well as OR. If you do not enter a filter term, all PDF files
arriving in the watched folder will be made searchable automatically.
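The way filter terms combine can be sketched in a few lines of Python. This is a minimal illustration, not the program's actual matching logic: the function name `matches_filter`, the property names, and the case-insensitive substring comparison are all assumptions made for this example.

```python
def matches_filter(properties, terms, mode="AND"):
    """Check whether a file's properties satisfy the filter terms.

    properties: dict such as {"filename": "...", "author": "..."}
    terms: list of (property_name, substring) pairs
    mode: "AND" (all terms must match) or "OR" (one match is enough)
    """
    if not terms:
        # No filter term entered: every PDF file is processed.
        return True
    results = [
        substring.lower() in properties.get(name, "").lower()
        for name, substring in terms
    ]
    if mode == "AND":
        return all(results)
    if mode == "OR":
        return any(results)
    raise ValueError(f"unknown mode: {mode!r}")
```

For a file named `scan_invoice_01.pdf` with author `Office Scanner`, the terms `("filename", "invoice")` and `("author", "reception")` match with OR but not with AND.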
Activate and configure task Make File Searchable
Next, select the category Make File Searchable and set the task status to Active.
First, select the directory where the searchable PDF file should be stored. The input field for the file name can be left
empty. In this case, the name of the original file will be used. Now specify the expected language of the document content.
More than 120 languages are available for selection.
For multilingual documents, you can add further languages using the button next to the language field. However, this often
gives a poorer OCR result, so additional languages should only be added when they are really needed.
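The file-name fallback mentioned above (an empty name field keeps the original name) can be sketched as follows. The function `output_path` and its parameters are hypothetical and serve only to illustrate the rule; they are not part of the program.

```python
from pathlib import Path


def output_path(source_file, target_dir, new_name=""):
    """Build the path of the searchable PDF in the target directory.

    If `new_name` is empty, the name of the original file is reused.
    """
    source = Path(source_file)
    # An empty (or whitespace-only) name falls back to the original file name.
    name = new_name.strip() or source.stem
    return Path(target_dir) / f"{name}.pdf"
```

With an empty name, `output_path("in/scan1.pdf", "out")` yields `out/scan1.pdf`; with `new_name="invoice"` it yields `out/invoice.pdf`.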
Before the OCR process, the images in the PDF document are prepared based on the activated options. Rotation correction and
deskewing should be enabled if scanned pages are skewed by more than 5 degrees. The options "Re-sharpen" and
"Increase contrast" are intended for older files that are only available in a low resolution. In this case, the two
options can significantly improve the OCR result. In general, however, the input resolution for the scanning process should be
set to at least 225 dpi, preferably 300 dpi (a resolution higher than 300 dpi no longer noticeably improves the OCR recognition
rate and only increases the file size).
Searchable PDF files are often significantly larger than the original files. If the available disk space is very limited, the
DPI value of the output file can be reduced. However, this lowers not only the required storage space but also the OCR
recognition rate.
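The trade-off between resolution and size follows from simple arithmetic: the number of pixels in a page image grows with the square of the DPI value. The sketch below illustrates this for a page of 8.5 x 11 inches, chosen only as an example. It counts raw pixels; the actual file size also depends on compression, which is not covered here, and both function names are made up for this illustration.

```python
def page_pixels(width_in, height_in, dpi):
    """Number of pixels in a page image of the given size (in inches)."""
    return round(width_in * dpi) * round(height_in * dpi)


def pixel_ratio(dpi_new, dpi_old):
    """Factor by which the pixel count changes when the resolution changes."""
    return (dpi_new / dpi_old) ** 2


print(page_pixels(8.5, 11, 300))  # 8415000
print(pixel_ratio(600, 300))      # 4.0
print(pixel_ratio(225, 300))      # 0.5625
```

Doubling the resolution from 300 to 600 dpi quadruples the pixel count, while going down from 300 to 225 dpi leaves roughly 56 % of the pixels.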
Another profile can be used to monitor the storage folder and rename or move the now searchable PDF files to different folders
based on their content, for example.
Status notifications
Here, you can specify whether status messages about the processing of the respective PDF file should be sent to a
specific email address. The status message can be sent either through the default Outlook email account or through an email
account with user-defined properties. The send settings are configured in the program options.
Make older PDF files searchable
Finally, there is an option to apply the created profile to all existing PDF files from a certain period, i.e., to make all
matching PDF files searchable. To do this, select the created profile in the profile list and click Catch-up. Otherwise, the
program will apply the profile only to PDF files that newly arrive in the watched folders.
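Selecting existing files from a certain period can be pictured as a date-range check. The following Python sketch is an assumption-laden illustration: the function `pdfs_in_period` is hypothetical, and it uses the file's modification time, whereas the source does not say which date the Catch-up function actually evaluates.

```python
from datetime import datetime
from pathlib import Path


def pdfs_in_period(folder, start, end):
    """Return PDF files in `folder` whose modification time lies in [start, end]."""
    selected = []
    for path in sorted(Path(folder).iterdir()):
        if not (path.is_file() and path.suffix.lower() == ".pdf"):
            continue
        # Assumption: the modification time decides whether a file is in the period.
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if start <= modified <= end:
            selected.append(path)
    return selected
```

A call such as `pdfs_in_period(folder, datetime(2024, 1, 1), datetime(2024, 1, 31, 23, 59, 59))` would return only the PDF files last modified in January 2024.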