34 Language Installation

34.1 Overview

The Language Installation window lets you download and install additional languages for OCR text recognition. By default, only German is installed. To recognize documents in other languages, install the corresponding language packages.

Access: Menu Tools → Install OCR Languages…


34.2 Available Languages

The system supports over 100 languages, including:

Western European Languages

  • German (pre-installed)
  • English
  • French
  • Spanish
  • Italian
  • Portuguese
  • Dutch

Eastern European Languages

  • Polish
  • Czech
  • Hungarian
  • Russian
  • Ukrainian

Asian Languages

  • Chinese (simplified and traditional)
  • Japanese
  • Korean
  • Arabic
  • Hebrew

Other Languages

  • The complete list is available in the installation window

34.3 Installation

Selecting a Language

  1. Open the Install OCR Languages window
  2. Find the desired language in the list
  3. Check the checkbox next to the language
  4. Click Install

Download

Language packages (Tessdata) are downloaded from the official Tesseract repositories:

  • File size: 1-50 MB depending on language
  • Storage location: %APPDATA%\Gillmeister Software\Automatic PDF Processor 2\tessdata\
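To check which language packages are present, the storage location can be inspected directly. The following is a minimal sketch, assuming the packages are stored as Tesseract's usual <code>.traineddata files (for example deu.traineddata). The function names tessdata_dir and installed_languages are illustrative and not part of the application.

```python
import os
from pathlib import Path


def tessdata_dir(appdata=None):
    """Build the tessdata path below %APPDATA% (or below an explicit base folder)."""
    base = appdata or os.environ.get("APPDATA", "")
    return Path(base) / "Gillmeister Software" / "Automatic PDF Processor 2" / "tessdata"


def installed_languages(folder):
    """Return the sorted language codes of all *.traineddata files in folder.

    Assumption: one file per language, named <code>.traineddata.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.traineddata"))
```

Calling installed_languages(tessdata_dir()) on a Windows machine returns the codes found in the default folder, or an empty list if the folder does not exist.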

Multiple Languages

You can select and install multiple languages at once. Downloads proceed sequentially.


34.4 Uninstallation

To remove an installed language:

  1. Uncheck the checkbox for the language
  2. Click Apply Changes
  3. The language file is deleted

Note: German cannot be uninstalled because it is the pre-installed default language.


34.5 Usage in Profiles

After installation, languages are available in the OCR settings of profiles:

  1. Open a profile → Tasks → OCR
  2. Select the Primary Language
  3. Optional: Select a Secondary Language for multilingual documents

Primary and Secondary Language

Setting             Usage
Primary Language    Main language of the document (required)
Secondary Language  For documents with mixed text (optional)

Example: A German document with English technical terms:

  • Primary Language: German
  • Secondary Language: English
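For orientation: the Tesseract engine accepts several languages joined by a plus sign, for example -l deu+eng on its command line. Whether Automatic PDF Processor hands the two settings to the engine in exactly this form is an assumption. The sketch below only illustrates how a primary and an optional secondary language could be combined; the function name is hypothetical.

```python
def tesseract_lang_argument(primary, secondary=None):
    """Combine language codes into Tesseract's 'lang1+lang2' notation.

    Assumption: codes are Tesseract language codes such as 'deu' or 'eng'.
    """
    if not primary:
        raise ValueError("a primary language is required")
    # The secondary language is optional and ignored if it repeats the primary one.
    if secondary and secondary != primary:
        return f"{primary}+{secondary}"
    return primary


# German document with English technical terms
print(tesseract_lang_argument("deu", "eng"))  # prints "deu+eng"
```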


34.6 Tips and Notes

Language Quality

Recognition quality depends on the language:

  • Very good: Western European languages with Latin script
  • Good: Eastern European languages, Greek
  • Variable: Asian languages, depending on print quality

Storage Requirements

Language Type       Typical Size
European Languages  1-5 MB
Asian Languages     10-50 MB
All Languages       ~500 MB
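To see how much space the installed languages actually occupy, the files in the tessdata folder can be summed up. This is a small sketch under the assumption that each language is a single *.traineddata file; tessdata_size_mb is an illustrative name, not an application function.

```python
from pathlib import Path


def tessdata_size_mb(folder):
    """Return the combined size of all *.traineddata files in folder, in MB."""
    total_bytes = sum(p.stat().st_size for p in Path(folder).glob("*.traineddata"))
    return total_bytes / (1024 * 1024)
```

Pass the path of the tessdata folder; files of other types in the same folder are not counted.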

Offline Installation

For environments without internet access, download the Tessdata files on a machine that has internet access and copy them manually into the tessdata folder (see Download in section 34.3 for the location).
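The manual copy step can also be scripted, for example when several machines need the same languages. The following is a hedged sketch: it assumes the downloaded files carry the .traineddata extension, and copy_language_files is a hypothetical helper. Whether the application picks up new files without a restart is not stated here.

```python
import shutil
from pathlib import Path


def copy_language_files(source, target):
    """Copy every *.traineddata file from source into target.

    Creates the target folder if needed and returns the copied file names.
    """
    source, target = Path(source), Path(target)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source.glob("*.traineddata")):
        shutil.copy2(src, target / src.name)  # overwrites an existing file of the same name
        copied.append(src.name)
    return copied
```

Here source would be the folder holding the downloaded files (for example a USB drive) and target the tessdata folder of the installation.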