HISMET: From Transcription to Thematic Classification of Early Modern Government Records
Some 65,000 pages of 17th-century Dutch government records have been transcribed via Transkribus, but how do you find the themes inside them? HISMET (HIStorical Themes via METadata) answers that question by combining embedding models with interactive t-SNE visualisations: a researcher identifies a theme in one cluster, and the label propagates automatically to all computationally similar documents. The result is a scalable thematic classification system that extends the Transkribus workflow from text recognition to content analysis and generates reusable metadata for early modern collections.
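The propagation step described above can be illustrated with a minimal sketch. It is not HISMET's implementation: the abstract does not say how similarity is measured, so the cosine similarity, the 0.9 threshold, the function names, the document identifiers, the toy three-dimensional vectors, and the theme names "trade" and "warfare" are all assumptions made for illustration. In practice the vectors would come from an embedding model, and the seed labels from a researcher inspecting a cluster in the t-SNE view.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate_labels(embeddings, seed_labels, threshold=0.9):
    """Copy each seed's theme to unlabelled documents that resemble it.

    embeddings:  dict mapping document id -> embedding vector (hypothetical data)
    seed_labels: dict mapping document id -> theme assigned by a researcher
    threshold:   minimum cosine similarity for a label to be copied (assumed value)
    """
    labels = dict(seed_labels)
    for doc_id, vec in embeddings.items():
        if doc_id in labels:
            continue  # already labelled by hand
        best_theme, best_sim = None, threshold
        for seed_id, theme in seed_labels.items():
            sim = cosine(vec, embeddings[seed_id])
            if sim >= best_sim:
                best_theme, best_sim = theme, sim
        if best_theme is not None:
            labels[doc_id] = best_theme  # inherit the closest seed's theme
    return labels


# Toy data: five "documents" with made-up 3-dimensional embeddings.
embeddings = {
    "res_001": [0.9, 0.1, 0.0],
    "res_002": [0.8, 0.2, 0.1],
    "res_003": [0.0, 0.1, 0.9],
    "res_004": [0.1, 0.0, 0.8],
    "res_005": [0.5, 0.5, 0.5],
}
seeds = {"res_001": "trade", "res_003": "warfare"}  # hypothetical themes

result = propagate_labels(embeddings, seeds, threshold=0.9)
for doc_id in sorted(embeddings):
    print(doc_id, result.get(doc_id, "(unlabelled)"))
```

In this toy run, `res_002` and `res_004` inherit the theme of their nearest seed, while `res_005` stays unlabelled because it is not close enough to either one. A threshold of this kind is one way to leave ambiguous documents for manual review instead of forcing a theme onto them.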
Part of
Poster Presentations