Lost in Transcription: Why Historical Tibetan Newspapers Demand Specialised AI Over General-Purpose Models

For historical Tibetan newspapers, the hard part is not the transcription but the layout. Stacked syllabic scripts, pages that mix Tibetan, Chinese, and Latin text, and widely varying column layouts cause general-purpose AI models to fail at the layout stage, before any text is recognised. The TransYolo pipeline addresses this with a purpose-built YOLO model for full-page line detection, feeding clean PAGE XML directly into Transkribus for script-specific recognition.
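
To make the hand-off between line detection and recognition concrete, here is a minimal sketch of how detected line boxes could be serialised as PAGE XML. It is an illustration under stated assumptions, not the TransYolo implementation: the function names, the `(x1, y1, x2, y2)` box format, the single enclosing text region, and the top-to-bottom ordering are all hypothetical, and the detector itself is left out, so the boxes are passed in directly.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# PAGE content schema namespace (2013-07-15 version).
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"


def _tag(name):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{PAGE_NS}}}{name}"


def box_to_points(box):
    """Convert an (x1, y1, x2, y2) box into a PAGE 'points' polygon string."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    return f"{x1},{y1} {x2},{y1} {x2},{y2} {x1},{y2}"


def boxes_to_page_xml(image_filename, width, height, line_boxes):
    """Serialise detected line boxes as a PAGE XML string.

    Assumption: line_boxes is a list of (x1, y1, x2, y2) pixel boxes, as a
    detector might produce. All lines go into one enclosing TextRegion and
    are ordered top to bottom, which ignores multi-column reading order.
    """
    ET.register_namespace("", PAGE_NS)
    root = ET.Element(_tag("PcGts"))

    meta = ET.SubElement(root, _tag("Metadata"))
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    ET.SubElement(meta, _tag("Creator")).text = "line-detection sketch"
    ET.SubElement(meta, _tag("Created")).text = now
    ET.SubElement(meta, _tag("LastChange")).text = now

    page = ET.SubElement(
        root,
        _tag("Page"),
        imageFilename=image_filename,
        imageWidth=str(width),
        imageHeight=str(height),
    )
    if not line_boxes:
        return ET.tostring(root, encoding="unicode")

    # Sort by the top edge so the lines appear in top-to-bottom order.
    ordered = sorted(line_boxes, key=lambda b: b[1])

    # One region whose box encloses every detected line.
    enclosing = (
        min(b[0] for b in ordered),
        min(b[1] for b in ordered),
        max(b[2] for b in ordered),
        max(b[3] for b in ordered),
    )
    region = ET.SubElement(page, _tag("TextRegion"), id="r1")
    ET.SubElement(region, _tag("Coords"), points=box_to_points(enclosing))

    for i, box in enumerate(ordered, start=1):
        line = ET.SubElement(region, _tag("TextLine"), id=f"r1l{i}")
        ET.SubElement(line, _tag("Coords"), points=box_to_points(box))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical detections for a 2000 x 3000 pixel scan.
    boxes = [(100, 400, 900, 460), (100, 300, 900, 360)]
    print(boxes_to_page_xml("page_001.jpg", 2000, 3000, boxes))
```

The sketch writes only line polygons. `Baseline` elements and per-column regions are left out, so a production pipeline would likely need to add both before recognition.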