Calamari
The open-source OCR engine Calamari uses a CNN/LSTM-based approach that processes entire line images, so text lines no longer have to be split into individual glyphs. This removes a time-consuming and error-prone segmentation step, increases recognition accuracy, and makes training material much easier to produce, since only complete lines need to be transcribed.
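To illustrate why no glyph segmentation is needed, the following minimal sketch decodes a whole line from per-position label probabilities. It assumes the network is trained with Connectionist Temporal Classification (CTC), the usual choice for line-based CNN/LSTM recognizers; the function name, the blank index, and the sample probabilities are illustrative assumptions and not Calamari's API.

```python
from typing import List, Sequence

BLANK = 0  # index reserved for the CTC "blank" label (assumed convention)


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Collapse per-frame predictions into a text line.

    frame_probs[t][k] is the probability of label k at horizontal position t.
    Label 0 is the blank; label k > 0 maps to alphabet[k - 1].
    """
    # Pick the most likely label in every frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    chars: List[str] = []
    previous = BLANK
    for label in best:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Hypothetical network output for a tiny line image, alphabet "ab".
probs = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.10, 0.70],  # b
    [0.60, 0.20, 0.20],  # blank
    [0.10, 0.10, 0.80],  # b (new character, because a blank came in between)
]
print(ctc_greedy_decode(probs, "ab"))  # abb
```

The decoder never needs to know where one glyph ends and the next begins; repeated labels and blanks take care of the alignment between image positions and characters.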
Calamari is based on the OCRopus toolbox but extends it considerably. Its deep neural networks (deep learning) achieve substantially higher recognition accuracy than the shallow networks of its predecessors. It also integrates several accuracy-enhancing techniques, such as pre-training, confidence-based voting, and data augmentation.
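The idea behind confidence-based voting can be sketched as follows. This is a simplified illustration, not Calamari's implementation: it assumes that the outputs of several models are already aligned position by position and have equal length, whereas a real voter has to align the outputs first. The names confidence_vote and voters are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One prediction = list of (character, confidence) pairs, one pair per position.
Prediction = List[Tuple[str, float]]


def confidence_vote(predictions: List[Prediction]) -> str:
    """Combine already aligned predictions by summing confidences per position."""
    if not predictions:
        return ""
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("this sketch expects predictions of equal length")

    voted = []
    for position in range(length):
        scores: Dict[str, float] = defaultdict(float)
        for prediction in predictions:
            char, confidence = prediction[position]
            scores[char] += confidence  # each model votes with its confidence
        voted.append(max(scores, key=scores.get))
    return "".join(voted)


# Three hypothetical models reading the same line.
voters = [
    [("f", 0.55), ("u", 0.9), ("n", 0.8)],
    [("s", 0.95), ("u", 0.9), ("n", 0.7)],
    [("f", 0.30), ("u", 0.8), ("u", 0.6)],
]
print(confidence_vote(voters))  # sun
```

In the first position two of three models predict "f", so a plain majority vote would choose it. Summing confidences instead lets the single, very confident "s" (0.95 against 0.85) win.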
Calamari is implemented in Python 3 and uses the TensorFlow framework for machine learning. It can run on powerful graphics cards (GPUs), which is considerably faster than running on the CPU, especially during training. Its flexible API and its interfaces to complex, widely used OCR data formats such as PAGE XML or ABBYY XML allow direct integration into existing and future OCR workflows. Calamari is one of the core submodules of OCR4all.
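As an illustration of what working with PAGE XML involves, the sketch below reads the line transcriptions from a PAGE document using only the Python standard library (Python 3.8 or later for the namespace wildcard). It does not use Calamari's own readers; the function name read_page_lines and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List


def read_page_lines(page_xml: str) -> List[str]:
    """Return the transcription of every TextLine in a PAGE XML string."""
    root = ET.fromstring(page_xml)
    lines = []
    # "{*}" matches any namespace, so different PAGE schema versions work.
    for text_line in root.findall(".//{*}TextLine"):
        # Only the TextEquiv directly below the line, not those of its words.
        unicode_node = text_line.find("{*}TextEquiv/{*}Unicode")
        if unicode_node is not None:
            lines.append(unicode_node.text or "")
    return lines


# Minimal hypothetical PAGE document with one region and two lines.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page0001.png" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r0">
      <TextLine id="r0l0">
        <Coords points="10,10 900,10 900,60 10,60"/>
        <TextEquiv><Unicode>first line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r0l1">
        <Coords points="10,70 900,70 900,120 10,120"/>
        <TextEquiv><Unicode>second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

print(read_page_lines(SAMPLE))  # ['first line', 'second line']
```

Because each TextLine carries both its coordinates and its transcription, a line-based engine can take the line images as input and write its results back into the same structure.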
To follow further development or to suggest improvements, see the project on GitHub.
Related Publications
Wick, C., Reul, C., Puppe, F.: Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. In: Digital Humanities Quarterly 14,2 (2020). URL
Wick, C., Reul, C., Puppe, F.: Comparison of OCR Accuracy on Early Printed Books using the Open Source Engines Calamari and OCRopus. In: JLCL Special Issue on Automatic Text and Layout Recognition 33,1 (2018), 79-96. URL





