BIDS TextXD 2020 and ‘Law & Society’ Hackathon

Text Analysis Across Domains


December 10, 2020 to December 12, 2020
11:00am to 12:00pm
Virtual Participation


Dates: December 10-12, 2020 

This year’s program was free to attend, presented virtually, and open to a global audience. All scholars, practitioners, learners, and entrepreneurs who engage with text analysis in their research were welcome and encouraged to register. Lightning Talk abstracts were due by November 1, 2020, and Lightning Talk videos by November 15, 2020.

TextXD 2020 convened an interdisciplinary group of practitioners, researchers, learners, and entrepreneurs who work with text as a primary source of data, and who use computational text analysis in a wide range of disciplines.  This year’s 3-day conference featured invited speakers, panel discussions, and exciting research talks spanning theory, applications, and tools. Participants were invited to engage actively, learn collaboratively, and deepen their expertise in text analysis by sharing their approaches, perspectives, and solutions, and by supporting each other in their practice.

Talks ranged from the theory of text analysis and deep learning to applied analyses and new software packages. The main conference event featured a series of online video presentations from this year’s speakers, leaders in innovation in text analysis across domains including Law and Society, Under-represented Languages in NLP, and Health and Environmental Issues.


DAY 1: THURSDAY, December 10, 2020
11:00 AM - 6:00 PM Pacific

11:00 AM -- Welcome and Introductions 

11:30 AM -- Session 1: Under-resourced Languages & Literature 

1:00 PM -- Session 2: NLP in the Social Sciences 

2:30 PM -- Session 3: NLP in theory and application

3:45 PM -- Lightning Talks Session 1

4:45 PM -- Social Hour

DAY 2: FRIDAY, December 11, 2020
11:00 AM - 3:45 PM Pacific

11:00 AM -- Welcome and Introductions

11:10 AM -- Session 4: Health & Life Sciences 

1:00 PM -- Session 5: Real World NLP

2:30 PM -- Lightning Talks Session 2 

3:30 PM -- Social Hour

DAY 3: SATURDAY, December 12, 2020 
10:00 AM - 12:00 PM Pacific

10:00 AM -- Hackathon Presentations

11:00 AM -- Closing Remarks and Award Ceremony

TextXD 'Law & Society' Hackathon

Thursday, December 10, at 10:00 AM through Saturday, December 12, at 12:00 PM
Apply by December 5. Applicants will then be notified if selected to attend.
BIDS’ TextXD 2020 conference on December 10-12 will feature an all-new, 48-hour TextXD ‘Law & Society’ Hackathon event starting at 10:00 AM on Thursday, December 10. The deadline to apply for the TextXD Hackathon has been extended to December 5. The event will conclude with a celebration and awards ceremony on Saturday, December 12, at 10:00 AM, featuring a panel of distinguished judges including Janet Napolitano, Professor of Public Policy at Berkeley's Goldman School of Public Policy, former president of the University of California, and former Secretary of Homeland Security under President Obama; and David Barstow, head of investigative reporting at the UC Berkeley Graduate School of Journalism, four-time Pulitzer Prize-winning journalist, and a former senior writer at The New York Times. Hackathon participants will work with a unique collection of recent datasets on police misconduct and associated public policy issues in California. Read more here: TextXD ‘Law & Society’ Hackathon to focus on police misconduct and data analysis in support of public policy (BIDS News, November 13, 2020)

TextXD 2020 Hackathon — Special Guest "Judges"

  • Roxanna Marie Altholz, Clinical Professor of Law and Co-Director, International Human Rights Law Clinic, UC Berkeley
  • David Barstow, Distinguished Chair in Investigative Journalism, UC Berkeley 
  • Sarah E. Chasins, Assistant Professor, Electrical Engineering and Computer Science, UC Berkeley
  • Janet Napolitano, Professor of Public Policy, Goldman School of Public Policy, and Director, Center for Security in Politics, UC Berkeley


TextXD 2020 Program Committee

  • Adam Anderson, UC Berkeley
  • Alex de Siqueira, UC Berkeley
  • Marsha Fenner, UC Berkeley
  • Ciera Martinez, UC Berkeley
  • David Mongeau, UC Berkeley
  • Heather Haveman, UC Berkeley
  • Maryam Vareth, UC Berkeley and UCSF
  • Niek Veldhuis, UC Berkeley

Subscribe to the TextXD Mailing List for further details and updated information. 

Contact: Questions may be directed to


Adam G. Anderson

BIDS Alum – Research Training Program Manager

Adam G. Anderson advised graduate students in the Computational Social Science Training Program, managed the Computational Social Science Forum, and helped organize BIDS's cross-domain (XD) initiatives.  He was also a lecturer in Digital Humanities and Data Science, and an academic coordinator for Digital Humanities at Berkeley, where he co-authored and designed the Theory and Methods curriculum for the DigHum Minor and Certificate Program. He was also a co-coordinator for the Digital Humanities Working Group (DHWG) and the Computational Text Analysis Working Group (CTAWG), as well as the topic area lead in Network Analysis and Text Analysis at the D-Lab.  His work brings together the fields of computational linguistics, archaeology and Assyriology / Sumerology to quantify the social and economic landscapes emerging during the Bronze Age in the ancient Near East.  His research interests include network analysis, archival studies, geospatial mapping and language modeling (NLP).  He applies these mixed methods to large datasets of ancient texts and archaeological records, in order to better understand the lives of individuals and groups within ancient societies, and to relate these findings within the context of our lives today.  He holds a PhD in Near Eastern Languages and Civilizations from Harvard University, an MA (zwischenprüfung) in Assyriology from  Ludwig-Maximilians University, and a BA in Linguistics from Brigham Young University.

Alex de Siqueira

Assistant Project Scientist

Alex de Siqueira is a postdoctoral researcher at BIDS, working on open source algorithms for processing computed tomography (CT) 3D images. He received his MS and PhD from the State University of São Paulo, Brazil, applying image processing tools to tackle challenges in materials science and geochronology. A core developer of scikit-image, he has been an open source and free software enthusiast since his first contact with Linux in 2000, contributing to several projects and events in Latin America and Europe. Alex also worked as a postdoctoral fellow at the State University of Campinas, Brazil, and the TU Bergakademie Freiberg, Germany, where he created pytracks and wrote Octave - Your first steps on scientific programming (in Brazilian Portuguese).

Marsha Fenner

Communications/Program Manager, Berkeley Institute for Data Science

Marsha Fenner is the Communications/Program Manager for the Berkeley Institute for Data Science. In this role, she works to connect researchers and data science practitioners across a wide array of academic disciplines, facilitate interdisciplinary collaboration, and implement training and education programs that engage and expand BIDS' and Berkeley’s active and diverse research community. Fenner has managed communications, training/education/outreach programs, and administrative operations for scientific programs and research initiatives at UC Berkeley and Lawrence Berkeley National Laboratory, including the Innovative Genomics Institute, the DOE Joint Genome Institute, and Berkeley Lab's Advanced Light Source. She holds an MA in philosophy and comparative religious studies, and a BA in classics, philosophy and mathematics.

Heather A. Haveman

Professor, Department of Sociology, UC Berkeley

Heather A. Haveman is a Professor of Sociology and Business at UC Berkeley. She holds a BA in history and an MBA (from the University of Toronto), and a Ph.D. in organizational behavior and industrial relations (from UC Berkeley).  Following positions at Duke University's Fuqua School of Business, Cornell University's Johnson Graduate School of Management, and Columbia University's Graduate School of Business, Professor Haveman joined UC Berkeley in July 2006. Her research interests include how organizations, the fields in which they are embedded, and the careers of their members and employees evolve. Her current work involves American magazines and wineries, Chinese listed firms, and the emerging marijuana market in several US states. 

Ciera Martinez

Biology and Environmental Sciences Lead

BIDS Biology and Environmental Sciences Lead Ciera Martinez focuses on data-intensive research projects that aim to understand how life on this planet evolves in reaction to the environment and climate, especially projects involving large and complex datasets. A long-time open science advocate, Ciera has been involved with and continues to be interested in working on training for open data, education, publishing, and software, including developing community standards for data management practices. As a 2019 Mozilla Open Science Fellow, she connected her love of data and museums and worked on projects aimed at understanding and increasing the usability of biodiversity and natural history museum data. She received her PhD in Plant Biology from UC Davis, researching the genetic mechanisms regulating plant architecture. She then went on to become an NSF Postdoctoral Fellow at UC Berkeley in the Molecular and Cellular Biology Department, studying genome evolution. She was also a BIDS postdoctoral Data Science Fellow for 3 years, working on undergraduate research practices, data science training, community development, and best practices for data science, diversity and inclusion, and computational research.

David Mongeau

Former Executive Director

David Mongeau, now the Founding Director of the School of Data Science at the University of Texas at San Antonio, was the Executive Director of BIDS from April 2018 to June 2021. During that time, in collaboration with the Faculty Director and Faculty Council, he set strategic direction and oversaw BIDS research, training, and outreach. He also led the institute’s industry and foundation relations and its engagement with other UC and global research institutes, all toward the overarching mission at BIDS to create and deploy data science methods, practices, and technologies to enable discovery. Previously, David co-led the data analytics institute at Ohio State; worked at Battelle, where he championed its proposal for an AI and cybersecurity company, now Covail; and worked for many years at Bell Labs, starting on the team that introduced the first C++ compiler and UNIX System V and leaving after building a global business and technology consulting practice, now part of Nokia Bell Labs Consulting. David earned his undergraduate degree at Carnegie Mellon University, and later earned a graduate degree at Rensselaer Polytechnic Institute and an MBA from Purdue University. Many of his interests lie beyond data science, embracing the humanities and arts.

Maryam Vareth

Health and Life Sciences Lead

Maryam Vareth leads BIDS’ data science research efforts in the Health & Life Sciences. Dr. Vareth is a Co-Director of the Innovate For Health initiative, a collaboration among UC Berkeley, UCSF, and Janssen Pharmaceutical Companies of Johnson & Johnson. As an experienced engineer, researcher, and data scientist, she applies mathematics, statistics, and physics to solve unmet needs in healthcare and to enhance patients’ experience during their medical journey. She is an advocate for “data-driven” medicine, and in particular for linking medical imaging data with medical diagnostics and therapeutics to extract clinically relevant insights through the use of open research and open source practices. Dr. Vareth received her BS and MS training in Electrical Engineering and Computer Science (EECS) from UC Berkeley, where she was awarded the prestigious Regents’ and Chancellor’s Scholarship. She completed her PhD through the joint UC Berkeley-UCSF Bioengineering program as a National Science Foundation Fellow, where she was awarded the Margaret Hart Surbeck Endowed Fellowship for Interdisciplinary Research for her work on developing new techniques and algorithms for the acquisition, reconstruction, and quantitative analysis of Magnetic Resonance Spectroscopy Imaging (MRSI), with the goal of improving its speed, sensitivity, and specificity to improve the management of patients with brain tumors. She conducted her post-doctoral fellowship at UCSF, combining structural, physiological, and metabolic imaging data from large clinical trials to quantitatively characterize heterogeneity within malignant brain tumors.

Niek Veldhuis

Professor of Assyriology, Department of Near Eastern Studies, UC Berkeley

Niek Veldhuis is Professor of Assyriology (cuneiform studies) in the Department of Near Eastern Studies. He received his PhD at the Rijksuniversiteit Groningen (The Netherlands) in 1997, and came to Berkeley in 2002. His primary interests are in the intellectual history of ancient Mesopotamia (History of the Mesopotamian Lexical Tradition, 2014) and Sumerian literature (Religion, Literature and Scholarship: The Sumerian Composition Nanše and the Birds, 2004).  He is director of the NEH-supported Digital Corpus of Cuneiform Lexical Texts and is a member of the international Oracc Steering Committee, providing tools and standards for digital publication of cuneiform texts to scholars worldwide. Today, his main research focus is on developing computational text analysis scripts (primarily in Jupyter Notebooks) for cuneiform datasets.

David Bamman

Assistant Professor, School of Information, UC Berkeley

David Bamman is an assistant professor in the School of Information at UC Berkeley, where he works on applying natural language processing and machine learning to empirical questions in the humanities and social sciences. His research often involves adding linguistic structure (e.g., syntax, semantics, coreference) to statistical models of text, and focuses on improving NLP for a variety of languages and domains (such as literary text and social media). Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University.

Sarah E. Chasins

Assistant Professor, Electrical Engineering and Computer Sciences, UC Berkeley

Sarah E. Chasins joined the UC Berkeley EECS faculty in 2020. Her lab invents usable programming tools to democratize computation, especially to empower social scientists, journalists, and other non-traditional programmers. Her research focuses on programming languages (PL) and program synthesis, with an emphasis on (i) work at the intersection of PL and human-computer interaction, and (ii) work at the intersection of PL and social good.

Chris Kennedy

BIDS Alum – Data Science Fellow

Chris Kennedy is an instructor in psychiatry at Harvard Medical School / Massachusetts General Hospital. He has a PhD in biostatistics from UC Berkeley. He is a senior fellow at UC Berkeley’s D-Lab and is affiliated with the Integrative Cancer Research Group and the Division of Research at Kaiser Permanente Northern California. At BIDS, he was a BIDS - Biomedical Big Data Training (BBDT) Data Science Fellow and a PhD student in biostatistics at UC Berkeley, where he worked with Alan Hubbard. He was also a D-Lab instructor and consultant, and an NIH biomedical big data trainee. His methodological interests encompassed targeted machine learning, randomized trials, causal inference, deep learning, text analysis, signal processing, and computer vision. His applications were primarily in precision medicine, public health, genomics, and election campaigns. His software projects included the SuperLearner ensemble learning system and varImpact for variable importance estimation; he leverages high performance computing on Savio and XSEDE clusters to accelerate his work. Prior to Berkeley he worked in political analytics in DC, running dozens of randomized trials and integrating machine learning into multi-million dollar programs to improve voter turnout for underrepresented Americans. He has also worked to support climate change action through Al Gore’s Climate Reality Project and the Yale Program on Climate Change Communication. He holds an M.A. in political science from UC Berkeley, an M.P.Aff. from the LBJ School of Public Affairs, and a B.A. in government & economics from The University of Texas at Austin.

Laura K. Nelson

BIDS Alum – Data Science Fellow

Former BIDS Data Science Fellow Laura K. Nelson is an Assistant Professor of Sociology in the College of Social Sciences and Humanities at Northeastern University. Laura uses computational methods and open source tools - principally automated text analysis - to study social movements, culture, gender, institutions, and organizations. She is particularly interested in developing computational tools that can bolster the way social scientists do inductive and theory-driven research. She received her PhD in sociology from the University of California, Berkeley, and she also holds an MA from UC Berkeley and a BA from the University of Wisconsin, Madison. While at UC Berkeley, she was a postdoctoral fellow with Digital Humanities @ Berkeley, developing a course for undergraduates on computational text analysis in the humanities and social sciences.