Data Science for Hard Core Humanists: Opportunities and Challenges from Computational Assyriology

BIDS Data Science Research Seminar


March 4, 2020
4:00pm to 5:00pm
190 Doe Library

Abstract: Over the last 25 years, Assyriologists have made great strides in collecting digital representations of cuneiform texts in transliteration. On the data science side, institutional developments and projects like Jupyter and Pandas have made computational methods more accessible to scholars of all stripes. The time has come, therefore, for these two developments to meet. The Computational Assyriology project aims to illustrate the potential of applying computational methods to cuneiform data by scaling up traditional Assyriological research questions and by building interactive visualizations of models (topic models, social networks, word embeddings) that can be explored in various ways. Will such models actually be used by practitioners in cuneiform studies? As it turns out, the challenges of data acquisition, data cleaning, coding, user interface, and social engineering are all intertwined.

BIDS Data Science Research Seminars feature Berkeley faculty and BIDS collaborators doing visionary research that illustrates the character of data science in this new decade. The series is offered to engage our diverse campus community and to enrich connections, discourse, and discovery among colleagues. All seminars are open to the public, and campus community members are especially encouraged to attend. Arrive half an hour early for light refreshments and discussion prior to the formal presentation.


Niek Veldhuis

Professor of Assyriology, Department of Near Eastern Studies, UC Berkeley

Niek Veldhuis is Professor of Assyriology (cuneiform studies) in the Department of Near Eastern Studies. He received his PhD at the Rijksuniversiteit Groningen (The Netherlands) in 1997, and came to Berkeley in 2002. His primary interests are in the intellectual history of ancient Mesopotamia (History of the Mesopotamian Lexical Tradition, 2014) and Sumerian literature (Religion, Literature and Scholarship: The Sumerian Composition Nanše and the Birds, 2004). He is director of the NEH-supported Digital Corpus of Cuneiform Lexical Texts and is a member of the international Oracc Steering Committee, providing tools and standards for digital publication of cuneiform texts to scholars worldwide. Today, his main research focus is on developing computational text analysis scripts (primarily in Jupyter Notebooks) for cuneiform datasets.