Enable image OCR

P4 Search uses Tesseract and ImageMagick for optical character recognition (OCR). For Windows installations, Tesseract and ImageMagick must be installed manually.

To learn more about these services, see the Tesseract and ImageMagick documentation.

If you do not wish to use Tesseract, you can use one of the following cloud OCR services instead:

  • AzureOcrModel

  • RekognitionOcrModel

  • GoogleOcrModel

To learn more about configuring the cloud OCR services, see Connect an OCR model.

Enable OCR on Linux

During installation, P4 Search automatically installs Tesseract and ImageMagick on Linux distributions that use .deb or .rpm packages. On other Linux distributions, you must locate and install these packages yourself.

Enable OCR on Windows

To enable OCR on Windows, you must install and enable Tesseract and ImageMagick.

Install Tesseract

To install Tesseract, see Tesseract installer for Windows.

When you install Tesseract on Windows, add the location of the tesseract.exe file to the PATH environment variable.
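
One way to confirm that the PATH change took effect is to look up the executable from a new terminal session. The following is a minimal sketch and not part of P4 Search: the helper names find_on_path and report_tool are hypothetical, and it assumes only that the Tesseract command-line tool accepts the --version flag.

```python
import shutil
import subprocess


def find_on_path(executable, path=None):
    """Return the full path of an executable found on PATH, or None.

    Hypothetical helper; `path` overrides the PATH value that is searched.
    """
    return shutil.which(executable, path=path)


def report_tool(executable, version_flag):
    """Describe whether a tool is reachable and, if so, its first version line."""
    location = find_on_path(executable)
    if location is None:
        return f"{executable}: not found on PATH"
    # Run the tool with its version flag; do not raise on a non-zero exit code.
    result = subprocess.run(
        [location, version_flag],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some builds print version details to stderr rather than stdout.
    output = (result.stdout or result.stderr).strip()
    first_line = output.splitlines()[0] if output else "(no version output)"
    return f"{executable}: {location} ({first_line})"


if __name__ == "__main__":
    print(report_tool("tesseract", "--version"))
```

If the script reports that tesseract is not found, the directory that contains tesseract.exe is probably missing from PATH, or the terminal was opened before the variable was changed.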

Tesseract uses ImageMagick for OCR. When both Tesseract and ImageMagick are installed, Tesseract takes precedence for P4 Search.

Install ImageMagick

To install ImageMagick, see ImageMagick Windows binary release.

Enable Tesseract OCR

To enable or disable Tesseract OCR for P4 Search, see Configure the index.