TextXD 2020 and ‘Law & Society’ Hackathon

Text Analysis Across Domains


December 10, 2020 to December 12, 2020
11:00am to 12:00pm
Virtual Participation


TextXD 2020 Text Analysis Across Domains

Dates: December 10-12, 2020 
Location: Virtual Participation
Register to attend and participate. This year’s program is free to attend, will be presented virtually, and is open to a global audience. All scholars, practitioners, learners, and entrepreneurs who engage with text analysis in their research are welcome and encouraged to register. Lightning talk abstracts were due by November 1, 2020, and lightning talk videos were due by November 15, 2020.

TextXD 2020 will convene an interdisciplinary group of practitioners, researchers, learners, and entrepreneurs who work with text as a primary source of data, and who use computational text analysis in a wide range of disciplines.  This year’s 3-day conference will feature invited speakers, panel discussions, and exciting research talks spanning theory, applications, and tools. Participants will be invited to engage actively, learn collaboratively, and deepen their expertise in text analysis by sharing their approaches, perspectives, and solutions, and by supporting each other in their practice.

Talks will range from the theory of text analysis and deep learning to applied analyses and new software packages. The main conference event will feature a series of online video presentations from this year’s speakers, leaders in innovation in text analysis across domains including Law and Society, Under-represented Languages in NLP, and Health and Environmental Issues.

PROGRAM SUMMARY - Link to Full Agenda

DAY 1: THURSDAY, December 10, 2020
11:00 AM - 6:00 PM Pacific
11:00 AM -- Welcome and Introductions
11:30 AM -- Session 1: Under-resourced Languages & Literature
1:00 PM -- Session 2: NLP in the Social Sciences
2:30 PM -- Session 3: NLP in theory and application
3:45 PM -- Lightning Talks Session 1
4:45 PM -- Social Hour

DAY 2: FRIDAY, December 11, 2020
11:00 AM - 3:45 PM Pacific

11:00 AM -- Welcome and Introductions
11:10 AM -- Session 4: Health & Life Sciences
1:00 PM -- Session 5: Real World NLP
2:30 PM -- Lightning Talks Session 2
3:30 PM -- Social Hour

DAY 3: SATURDAY, December 12, 2020
10:00 AM - 12:00 PM Pacific

10:00 AM -- Hackathon Presentations
11:00 AM -- Closing Remarks and Award Ceremony

TextXD 'Law & Society' Hackathon

Thursday, December 10, at 10:00 AM through Saturday, December 12, at 12:00 PM
Apply by December 5. Applicants will then be notified if selected to attend.
BIDS’ TextXD 2020 conference on December 10-12 will feature an all-new, 48-hour TextXD ‘Law & Society’ Hackathon starting at 10:00 AM on Thursday, December 10. The deadline to apply for the hackathon has been extended to December 5. The event will conclude with a celebration and awards ceremony on Saturday, December 12, at 10:00 AM, featuring a panel of distinguished judges including Janet Napolitano, Professor of Public Policy at Berkeley's Goldman School of Public Policy, former president of the University of California, and former Secretary of Homeland Security under President Obama; and David Barstow, head of investigative reporting at the UC Berkeley Graduate School of Journalism, four-time Pulitzer Prize-winning journalist, and former senior writer at The New York Times. Hackathon participants will work with a unique collection of recent datasets on police misconduct and associated public policy issues in California. Read more here: TextXD ‘Law & Society’ Hackathon to focus on police misconduct and data analysis in support of public policy (BIDS News, November 13, 2020)

Confirmed Speakers (as of October 15, 2020)

  • Ken Benoit, Professor of Computational Social Science, London School of Economics
  • David Bamman, Assistant Professor in the School of Information, UC Berkeley
  • Chris Kennedy, Research Fellow in Biomedical Informatics, Harvard Medical School 
  • Matthew Lavin, Clinical Assistant Professor of English and Director of the Digital Media Lab, University of Pittsburgh
  • Laura Nelson, Assistant Professor of Sociology, Northeastern University 
  • Julia Silge, Data Scientist and Software Engineer at RStudio
  • Irena Spasic, Professor of Computer Science and Informatics, Cardiff University

Program Committee

  • Adam Anderson, UC Berkeley
  • Ciera Martinez, UC Berkeley
  • David Mongeau, UC Berkeley
  • Heather Haveman, UC Berkeley
  • Maryam Vareth, UC Berkeley and UCSF
  • Niek Veldhuis, UC Berkeley

Detailed program information with the schedule of speakers and lightning talks will be available soon on the TextXD website.  

Subscribe to the TextXD Mailing List for further details and updated information. 

Contact: Questions may be directed to textxd@berkeley.edu.


David Bamman

Assistant Professor, School of Information, UC Berkeley

David Bamman is an assistant professor in the School of Information at UC Berkeley, where he works on applying natural language processing and machine learning to empirical questions in the humanities and social sciences. His research often involves adding linguistic structure (e.g., syntax, semantics, coreference) to statistical models of text, and focuses on improving NLP for a variety of languages and domains (such as literary text and social media). Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University.

Ken Benoit

Professor, Computational Social Science, London School of Economics

Ken Benoit is a Professor of Computational Social Science in the Department of Methodology at the London School of Economics and Political Science, and Professor (Part-time) in the School of Politics and International Relations at the Australian National University. His current research focuses on computational, quantitative methods for processing large amounts of textual data, mainly political texts and social media. His current interests span the analysis of big data, including social media, and methods of text mining. His substantive research in political science focuses on comparative party competition, the European Parliament, electoral systems, and the effects of campaign spending.

Chris Kennedy

BIDS Alumni - BIDS-BBDT Data Science Fellow

Chris Kennedy is now a postdoctoral fellow in biomedical informatics at Harvard Medical School, focusing on deep learning and causal inference in Gabriel Brat’s surgical informatics lab. He has a PhD in biostatistics from UC Berkeley. He is a senior fellow at UC Berkeley’s D-Lab and is affiliated with the Integrative Cancer Research Group and the Division of Research at Kaiser Permanente Northern California. At BIDS, he was a BIDS - Biomedical Big Data Training (BBDT) Data Science Fellow and a PhD student in biostatistics at UC Berkeley, where he worked with Alan Hubbard. He was also a D-Lab instructor and consultant, and an NIH biomedical big data trainee. His methodological interests encompassed targeted machine learning, randomized trials, causal inference, deep learning, text analysis, signal processing, and computer vision. His applications were primarily in precision medicine, public health, genomics, and election campaigns. His software projects included the SuperLearner ensemble learning system and varImpact for variable importance estimation; he leveraged high-performance computing on Savio and XSEDE clusters to accelerate his work. Prior to Berkeley he worked in political analytics in DC, running dozens of randomized trials and integrating machine learning into multi-million dollar programs to improve voter turnout for underrepresented Americans. He has also worked to support climate change action through Al Gore’s Climate Reality Project and the Yale Program on Climate Change Communication. He holds an M.A. in political science from UC Berkeley, an M.P.Aff. from the LBJ School of Public Affairs, and a B.A. in government & economics from The University of Texas at Austin.

Laura K. Nelson

Alumni - BIDS Data Science Fellow

Former BIDS Data Science Fellow Laura K. Nelson is an Assistant Professor of Sociology in the College of Social Sciences and Humanities at Northeastern University. Laura uses computational methods and open source tools - principally automated text analysis - to study social movements, culture, gender, institutions, and organizations. She is particularly interested in developing computational tools that can bolster the way social scientists do inductive and theory-driven research. She received her PhD in sociology from the University of California, Berkeley, and she also holds an MA from UC Berkeley and a BA from the University of Wisconsin, Madison. While at UC Berkeley, she was a postdoctoral fellow with Digital Humanities @ Berkeley, developing a course for undergraduates on computational text analysis in the humanities and social sciences. 

Matthew Lavin

Clinical Assistant Professor of English and Director of the Digital Media Lab, University of Pittsburgh

Matthew Lavin is a Clinical Assistant Professor of English and Director of the Digital Media Lab at the University of Pittsburgh. Lavin’s scholarly work takes place at the intersection of book history and digital humanities, with equal attention to computational methods and humanities data. His research interests also include American literature, the history of authorship, book history and technology, open access and copyright, and digital pedagogy.

Julia Silge

Data Scientist and Software Engineer at RStudio

Julia Silge is a data scientist and software engineer at RStudio, where she works on open source modeling tools. As a software developer for the data science ecosystem in R, Silge is best known for her tidytext package, which is downloaded from CRAN about 40,000 times per month. Silge studied physics and astronomy, and worked in academia (teaching and doing research) and ed tech before moving into data science and discovering R. She is both an international speaker and a real-world practitioner focusing on data analysis and machine learning practice. She has written books with collaborators about text mining, supervised machine learning for text, and modeling with tidy data principles in R.

Irena Spasic

Professor of Computer Science and Informatics, Cardiff University

Irena Spasic is a professor of Computer Science and Informatics at Cardiff University. Her academic efforts are focused on establishing excellence in research related to text mining, in order to gain knowledge for significant interventions and decision making in the context of big data. She has made contributions in the areas of text classification, information extraction, term recognition, and sentiment analysis. Her research has been most often applied in the health and life sciences, where it has led to interdisciplinary collaboration with impacts beyond computer science. For example, she is a co-founder of HealTex, the UK Healthcare Text Analytics Network, a multi-disciplinary research network that aims to facilitate the use of healthcare free text (clinical notes, letters, social media posts, literature) in research and clinical practice.