Publications
2024
- OCR4all 1.0 – Flexible open-source OCR/HTR based on various single-step Solutions. In 16th IAPR International Workshop on Document Analysis Systems (DAS). 2024.
- OCR(-D)4all – An easy to use and highly adaptable Open-Source Solution for Automatic Text Recognition of Historical Printings and Manuscripts. In 4th Conference for Research Software Engineering in Germany (deRSE). 2024.
- Edierst Du noch oder trainierst Du schon? Forschungsdaten als Grundlage von Trainingsdaten für die automatische Texterkennung. In DHd2024: Quo Vadis DH. 2024.
- Das richtige Tool für die Volltextdigitalisierung. In DHd2024: Quo Vadis DH. 2024.
2023
- Analyse, Produktion, Reflexion: Nachnutzungsszenarien für Forschungsdaten am Beispiel der Daten des Projekts Dehmel digital. In DHd2023: Open Humanities, Open Culture. 2023.
- Algorithmen-gestützte Analyse visuell-materieller Eigenschaften von Briefen. In DHd2023: Open Humanities, Open Culture. 2023.
- Synoptische Interfaces Digitaler Editionen. In DHd2023: Open Humanities, Open Culture. 2023.
2022
- Handwritten Text Recognition und Word Mover’s Distance als Grundlagen der digitalen Edition "Die Kindheit Jesu Konrads von Fußesbrunnen". In DHd2022: Kulturen des digitalen Gedächtnisses. 2022.
- Semantische Tiefenerschließung historischer Lexika mittels Text- und Typographieerkennung. In Ursula Rautenberg/Anja Voeste (Hgg.): Typographie. Theoretische Konzeptionen, historische Perspektiven, künstlerische Applikationen. 2022.
- Open Source Handwritten Text Recognition on Medieval Manuscripts using Mixed Models and Document-Specific Finetuning. In 15th IAPR International Workshop on Document Analysis Systems (DAS). 2022.
- OCR4all - Massenvolltextdigitalisierung von Drucken mithilfe von OCR-D und hochqualitative Transkription von Handschriften. In DHd2022: Kulturen des digitalen Gedächtnisses. 2022.
2021
- One-Model Ensemble-Learning for Text Recognition of Historical Printings. In 16th International Conference on Document Analysis and Recognition (ICDAR). 2021.
- OCR-D & OCR4all: Two Complementary Approaches for Improved OCR of Historical Sources. In 6th International Workshop on Computational History. 2021.
- Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning. In 6th International Workshop on Historical Document Imaging and Processing (HIP). 2021.
2020
- Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. In Digital Humanities Quarterly, 14(2). 2020.
2019
- Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin. In JLCL: Special Issue on Automatic Text and Layout Recognition. 2019.
- State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines. In DHd 2019 Digital Humanities: multimedial & multimodal. 2019.
- OCR4all - An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings. In Applied Sciences, 9(22). 2019.
- Korrektur von fehlerhaften OCR Ergebnissen durch automatisches Alignment mit Texten eines Korpus. In DHd2019: multimedial & multimodal. 2019.
- Automatic Semantic Text Tagging on Historical Lexica by Combining OCR and Typography Classification. In 3rd International Conference on Digital Access to Textual Cultural Heritage (DATeCH). 2019.
2018
- Comparison of OCR Accuracy on Early Printed Books using the Open Source Engines Calamari and OCRopus. In JLCL: Special Issue on Automatic Text and Layout Recognition, 33(1), pp. 79–96. 2018.
- Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning. In JLCL: Special Issue on Automatic Text and Layout Recognition, 33(1), pp. 3–24. 2018.
- Improving OCR Accuracy on Early Printed Books by Utilizing Cross Fold Training and Voting. In 13th IAPR International Workshop on Document Analysis Systems (DAS). 2018.
2017
- Transfer Learning for OCRopus Model Training on Early Printed Books. In 027.7 Journal for Library Culture. 2017.
- LAREX – A semi-automatic open-source Tool for Layout Analysis and Region Extraction on Early Printed Books. In 2nd International Conference on Digital Access to Textual Cultural Heritage (DATeCH). 2017.
- Case Study of a highly automated Layout Analysis and OCR of an incunabulum: ‘Der Heiligen Leben’ (1488). In 2nd International Conference on Digital Access to Textual Cultural Heritage (DATeCH). 2017.
2016
- Cross Dataset Evaluation of Feature Extraction Techniques for Leaf Classification. In International Journal of Artificial Intelligence & Applications. 2016.
- Expectation-driven Text Extraction from Medical Ultrasound Images. In Studies in Health Technology and Informatics. 2016.
- Autonomous Quadrocopter for Search, Count and Localization of Objects. In Recent Advances in Robotic Systems. 2016.