Challenging Error Correction in Recognised Byzantine Greek

Abstract

Automatic correction of errors in Handwritten Text Recognition (HTR) output remains an unresolved challenge. In this study, we introduce a shared task addressing it, which attracted 271 submissions but yielded only a handful of promising approaches. This paper presents the datasets, the most effective methods, and an experimental analysis of error correction for HTRed manuscripts and papyri in Byzantine Greek, the language that followed Classical and preceded Modern Greek. Using recognised and transcribed data from seven centuries, we compare the two best-performing methods, one based on a neural encoder-decoder architecture and the other on engineered linguistic rules. We show that both can reduce the recognition error rate, by up to 2.5 points at the character level and up to 15 points at the word level, and we elucidate their respective strengths and weaknesses.
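The reductions quoted in the abstract are differences in error rate before and after correction. The sketch below illustrates one common way to compute such figures, assuming the error rate is the Levenshtein distance between the system output and the human transcription, normalised by the transcription length and expressed as a percentage. This is an assumption for illustration, not necessarily the exact evaluation protocol of the shared task; the function names and the toy strings are hypothetical and are not taken from the paper's data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, level="char"):
    """Error rate in percent at the character ("char") or word ("word") level."""
    if level == "char":
        ref_units, hyp_units = list(reference), list(hypothesis)
    else:
        ref_units, hyp_units = reference.split(), hypothesis.split()
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def reduction(reference, recognised, corrected, level="char"):
    """Points of error rate removed by the correction step (positive is better)."""
    return (error_rate(reference, recognised, level)
            - error_rate(reference, corrected, level))


# Toy strings invented for illustration only (not from the shared-task data).
reference = "ο λογος του θεου"    # human transcription
recognised = "ο λογοσ τον θεου"   # hypothetical HTR output with two errors
corrected = "ο λογος τον θεου"    # hypothetical output after correction

cer_gain = reduction(reference, recognised, corrected, level="char")
wer_gain = reduction(reference, recognised, corrected, level="word")
print(f"CER reduction: {cer_gain:.2f} points, WER reduction: {wer_gain:.2f} points")
```

A single corrected character can repair a whole word, which is one reason word-level gains tend to be larger than character-level gains. For polytonic Greek, both strings would normally be brought to the same Unicode normalisation form before comparison, since accents can be encoded either as precomposed or as combining characters.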

Authors

Pavlopoulos, John
Kougia, Vasiliki
Arias, Esteban Garces
Platanou, Paraskevi
Shabalin, Stepan
Liagkou, Konstantina
Papadatos, Emmanouil
Essler, Holger
Camps, Jean-Baptiste
Fischer, Franz

Shortfacts

Category	Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title	62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
Divisions	Data Mining and Machine Learning
Subjects	Artificial Intelligence
Event Location	Bangkok, Thailand
Event Type	Workshop
Event Dates	11-16 August 2024
Series Name	Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Page Range	pp. 1-12
Date	2024