Deciphering the Undeciphered: HTR for Tangut Manuscripts
The Tangut script, used in the Western Xia dynasty and still largely undeciphered, is one of the most challenging frontiers in historical text recognition. This study develops the first Transkribus HTR model for Tangut manuscripts, trained on a curated dataset of 30–50 pages. The result is a scalable digitisation workflow that applies AI-driven transcription to a fragile, under-resourced script tradition, opening new possibilities for Western Xia studies.
Part of
Poster Presentations