Why does language inclusivity matter in text recognition and otherwise in the age of AI?

Why does language inclusivity matter in text recognition and otherwise in the age of AI? Does it matter at all? The rapid development of digital humanities (DH) has diversified the field, bringing with it a wide array of tools and methods. However, language remains a problem in DH practice, because most languages, and particularly those written in non-Latin scripts, are still significantly underrepresented and under-resourced in this field.

This talk considers the complexities of this situation in relation to multilingual DH, drawing on individual and data-driven collaborative research projects. It focuses both on conceptual perspectives on language inclusivity in digital scholarship and on more pragmatic considerations regarding its "users", approached through UX methods. Through case studies, it opens up a space to discuss the realities and challenges of the lifecycle of a multilingual DH project. These case studies point to broader issues that multilingual DH as a phenomenon reveals about the importance of language diversity for equal access to digital tools, particularly in the context of text recognition models.

Arguing for a more multifaceted understanding of the realities of those using multilingual DH, the talk also showcases concrete initiatives, in advocacy as well as at the organisational and infrastructural levels, that enhance language diversity and work towards a more inclusive DH ecosystem.