Handwritten Text Recognition in Medieval Germanic Manuscripts: The Peterborough Chronicle as a Case Study
Abstract
Over the past few decades, the mass digitisation of repositories has created an urgent need for computational methods that give access to their contents. In the realm of text technologies, Handwritten Text Recognition (HTR) is one of the most promising tools, capable of generating machine-readable transcriptions from digitised manuscripts semi-automatically. Although digital palaeography is progressing rapidly, the implementation of HTR models for medieval Germanic languages has so far proved elusive. As part of a workflow to produce a digital edition of the Anglo-Saxon Chronicle, I have devised a model for Old English, trained on the Peterborough Chronicle using the AI-powered platform Transkribus, which returns ~99% accuracy. HTR can transfer annotation across languages: the hypothesis that the model could serve as the basis for building recognition systems for cognate languages will be tested on the Old Saxon Heliand. While acknowledging the potential of the digital medium, I use the outcomes of this experiment to emphasise that human analytical work is integral to this process and key to expanding research horizons in the field of digital manuscript studies.
Licence
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 licence.
CC-BY-SA