Training a Model for the Automated Transliteration of Romanian Parish Registers Written in Cyrillic (19th-Century Transylvania): Difficulties and Current Progress

Romanian parish registers from 19th-century Transylvania were written in Cyrillic, a script that almost no one can read today. This project is training a handwritten text recognition (HTR) model in Transkribus to transliterate these registers automatically. It addresses the complex mismatch between Cyrillic orthography and Romanian phonology, and aims to open up thousands of documents to genealogists, historians, and communities across Romania.