Step-by-step instructions for automatically splitting PDF files with Automatic PDF Processor for Windows
Introduction
In this tutorial, we will show you how to set up a hot folder to auto-split PDF files. After the profile has been created, all
new PDF files arriving in this folder will be automatically split into multiple, separate documents according to the rules defined
in the profile. You can also batch-split existing PDF documents using the Catch-up function in the upper toolbar.
Create a new profile
Click the "New profile..." button in the toolbar to create a new profile. Enter a meaningful profile name in the
configuration window - for example, "Split collective invoices" or "Split collective protocols". Optionally,
add a comment, for example, the destination folder. You can also assign the profile a color so that its tasks are easy to distinguish
in the log list.
Specify the folder to be monitored
Next, specify one or more folders to be monitored. As soon as new PDF files arrive in the folder, the program will
detect and process them automatically - in this case, splitting them. Click the Add button and select one of the folders listed
there.
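To illustrate the hot-folder idea, here is a minimal Python sketch that compares two snapshots of a folder and reports the PDF files that have arrived in between. It is not how Automatic PDF Processor detects files internally (the source does not describe that); the function names snapshot and new_files are hypothetical, and polling by snapshot comparison is an assumption chosen for simplicity.

```python
import tempfile
from pathlib import Path


def snapshot(folder):
    """Return the names of all PDF files currently in the folder (hypothetical helper)."""
    return {
        entry.name
        for entry in Path(folder).iterdir()
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    }


def new_files(before, after):
    """Return the PDF names present in the later snapshot but not in the earlier one."""
    return sorted(after - before)


# Demonstration in a temporary folder that stands in for the monitored folder.
with tempfile.TemporaryDirectory() as folder:
    before = snapshot(folder)
    (Path(folder) / "scan_001.pdf").write_bytes(b"%PDF-1.4")
    (Path(folder) / "notes.txt").write_text("not a PDF, so it is ignored")
    arrived = new_files(before, snapshot(folder))

print(arrived)  # ['scan_001.pdf']
```

Only the newly arrived PDF is reported; files of other types are ignored, which mirrors the behavior described above.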
Set up one or more filters
Optionally, set filter criteria here so that only specific PDF files are split. You can filter by file properties, such as part of the
file name, or by document properties, such as author, subject, or even the document text. Filter terms can be combined with logical AND
as well as OR. If you do not enter a filter term, the software will automatically split all PDF files entering the watched folder.
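The way AND and OR filters combine can be sketched in a few lines of Python. This is only an illustration of the logic, not the program's implementation: the PdfInfo class, the helper names, and the case-insensitive "contains" matching are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PdfInfo:
    """Hypothetical stand-in for the properties a filter can inspect."""
    file_name: str
    author: str = ""
    subject: str = ""
    text: str = ""


def contains(field, term):
    """Build a filter: does the given property contain the term (case-insensitive, assumed)?"""
    def predicate(pdf):
        return term.lower() in getattr(pdf, field).lower()
    return predicate


def all_of(*predicates):
    """Logical AND: every filter must match."""
    return lambda pdf: all(p(pdf) for p in predicates)


def any_of(*predicates):
    """Logical OR: at least one filter must match."""
    return lambda pdf: any(p(pdf) for p in predicates)


def matches(pdf, rule=None):
    """Without a rule, every PDF file is processed."""
    return True if rule is None else rule(pdf)


# File name contains "invoice" AND (author contains "accounting" OR text contains "invoice number").
rule = all_of(
    contains("file_name", "invoice"),
    any_of(contains("author", "accounting"), contains("text", "invoice number")),
)

print(matches(PdfInfo(file_name="Invoice_2024.pdf", author="Accounting Dept"), rule))  # True
print(matches(PdfInfo(file_name="report.pdf"), rule))                                  # False
print(matches(PdfInfo(file_name="report.pdf")))                                        # True
```

The last call shows the "no filter term" case: with no rule defined, every file is accepted.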
Enable and configure the Split File task
In this category, you define the directory in which the separated individual documents are to be saved. Optionally, you can use dynamic
content for the folder structure and/or the file name. First, specify the base directory, e.g., "D:\Separated files". In
the field Subfolder, you can use dynamic properties (e.g., part of the file name) of the original PDF file. Click on Placeholder and
select the appropriate entry. Various properties of the original document can also be integrated into the file names of the individual
documents. The preview is calculated based on the previously added sample files.
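As a rough illustration of how a base directory, a dynamic subfolder, and a dynamic file name fit together, consider the following Python sketch. The placeholder syntax ({prefix}, {stem}, {part}) is invented for this example and is not the syntax used by the program; the build_target function, the sample input path, and the running part number are likewise hypothetical.

```python
from pathlib import PureWindowsPath


def build_target(base_dir, subfolder_pattern, name_pattern, source_file, part_number):
    """Expand illustrative placeholders using properties of the ORIGINAL file."""
    source = PureWindowsPath(source_file)
    values = {
        "stem": source.stem,                   # original file name without extension
        "prefix": source.stem.split("_")[0],   # part of the original file name
        "part": part_number,                   # running number of the separated document (assumed)
    }
    subfolder = subfolder_pattern.format_map(values)
    file_name = name_pattern.format_map(values)
    return PureWindowsPath(base_dir) / subfolder / file_name


target = build_target(
    r"D:\Separated files",          # base directory from the text above
    "{prefix}",                     # subfolder built from part of the file name
    "{stem}_{part:03d}.pdf",        # file name pattern for each separated document
    r"C:\Inbox\ACME_invoices.pdf",  # hypothetical original file
    2,
)
print(target)  # D:\Separated files\ACME\ACME_invoices_002.pdf
```

All values come from the original file, which is why content extracted from the individual separated documents cannot be used at this stage.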
Note: The placeholders here refer to the original file. In order to name the separated individual documents dynamically (for
example with extracted text components, such as an invoice or protocol number), the separated files must first be saved in an intermediate
directory. This intermediate directory must then be monitored by another profile in which automatic renaming or automatic moving
to the target directory is configured (see the instructions: Rename PDF files automatically).
Furthermore, you can specify here how the program should behave if a file with the same name already exists.
Determine the type of splitting
Determine here how PDF files are separated. The following options are available:
- Number of pages
- File size
- Top level bookmarks
- Keywords
- Barcode or QR code
- Placeholder (when values of extracted data change)
- Blank pages
To save each page of a document as an individual file, select "Number of pages" and set "Max 1 page". Another
commonly used splitting method is the use of keywords. This can also be used to exclude unwanted pages. For "From -> page
contains:" enter a term that occurs on the first page of each individual document, for example, "protocol number:".
Optionally, for "To -> Page contains:" enter a term that occurs on the last page of each individual document, for
example, "Total:". If one-page documents are expected, the same term can be used in both fields, here "protocol number:".
With this type of splitting, intermediate pages without text or without the search term are skipped, i.e., not extracted.
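The keyword-based splitting described above can be sketched in Python as follows. The sketch works on a list of page texts rather than on a real PDF file, and it rests on assumptions the source does not confirm: matching is case-insensitive, a new "From" page closes a document that is still open, and pages outside any From/To range are dropped. The function name split_by_keywords is hypothetical.

```python
def split_by_keywords(pages, from_term, to_term=None):
    """Group 0-based page indices into documents using start and optional end keywords."""
    documents = []
    current = None  # page indices of the document being collected, or None

    for index, text in enumerate(pages):
        lowered = text.lower()
        if from_term.lower() in lowered:
            if current is not None:
                documents.append(current)  # a new start page closes the open document
            current = [index]
        elif current is not None:
            current.append(index)
        # Otherwise the page lies outside any document and is skipped.

        if current is not None and to_term and to_term.lower() in lowered:
            documents.append(current)      # the end keyword closes the document
            current = None

    if current is not None:
        documents.append(current)          # last document without an end page
    return documents


pages = [
    "Protocol number: 1\nIntroduction",
    "Details",
    "Total: 10",
    "",                              # page without text between documents
    "Protocol number: 2 Total: 5",   # one-page document
    "Appendix",                      # no open document, so it is skipped
]
print(split_by_keywords(pages, "protocol number:", "Total:"))  # [[0, 1, 2], [4]]
```

Using the same term for both fields yields one-page documents: split_by_keywords(["Protocol number: 7", "", "Protocol number: 8"], "protocol number:", "protocol number:") returns [[0], [2]].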
When splitting documents according to a fixed number of pages, an intermediate directory can again be used. A second profile then
moves only those documents that meet certain filter criteria to the actual target directory.
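The two-stage approach (split by a fixed page count first, then keep only the parts that match a filter) can be sketched in Python like this. Again, the sketch operates on page texts instead of real PDF files; the function names and the keyword filter in the second stage are hypothetical examples.

```python
def split_by_page_count(page_count, max_pages):
    """Stage 1: group 0-based page indices into chunks of at most max_pages pages."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        list(range(start, min(start + max_pages, page_count)))
        for start in range(0, page_count, max_pages)
    ]


def select_for_target(chunks, page_texts, keyword):
    """Stage 2: keep only chunks in which at least one page contains the keyword."""
    return [
        chunk
        for chunk in chunks
        if any(keyword.lower() in page_texts[i].lower() for i in chunk)
    ]


page_texts = ["Invoice A", "Terms", "Cover letter", "", "Invoice B"]

chunks = split_by_page_count(len(page_texts), 2)
print(chunks)                                               # [[0, 1], [2, 3], [4]]
print(select_for_target(chunks, page_texts, "invoice"))     # [[0, 1], [4]]
print(split_by_page_count(3, 1))                            # [[0], [1], [2]]
```

The last line corresponds to the "Max 1 page" setting, where every page becomes its own document.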
The software provides additional filtering options that allow you to exclude pages from the splitting process. For example, it is
possible to discard pages without text, or pages that do or do not contain specific keywords.
Status notifications
Finally, you can specify whether status messages about the processing of the respective PDF file (success, error, no
match, no text, ...) should be sent to a specific email address. To send the status message, you can use either the default Outlook
email account or an email account with user-defined properties. After a PDF document has been split successfully, a WAV file of your
choice can also be played.