TextXD 2020: BIDS’ cross-domain conference to focus on text analysis applications in social justice, human health, and environmental research

October 15, 2020

Registration is now open for TextXD 2020, which will convene an interdisciplinary group of practitioners, researchers, learners, and entrepreneurs who work with text as a primary source of data, and who use computational text analysis in a wide range of disciplines. 

This year’s three-day virtual conference on December 10-12 will feature invited speakers, panel discussions, and exciting research talks spanning theory, applications, and tools. Participants will be invited to engage actively, learn collaboratively, and deepen their expertise in text analysis by sharing their approaches, perspectives, and solutions, and by supporting each other in their practice. Participants in BIDS’ TextXD (cross-domain) initiative are text processing experts from academia, research, and industry, who work together to (1) identify common principles, algorithms and tools to advance text-intensive research; (2) break down the boundaries between research domains; and (3) foster connections and new collaborations among like-minded researchers. 

This year’s program is free to attend and open to a global audience. All scholars, practitioners, learners, and entrepreneurs who engage with text analysis in their research are welcome and encouraged to register.

Talks will range from the theory of text analysis and deep learning to applied analyses and new software packages. The main conference event will feature a series of online video presentations from this year’s speakers, who are leaders in text analysis innovation across domains including Law and Society, Under-represented Languages in NLP, and Health and Environmental Issues.

Those who wish to participate more actively in the presentations may submit a short (3-5 minute) lightning talk about their research. Participants are welcome to submit lightning talk abstracts by November 1, and to submit lightning talk videos by November 15.

This year’s conference will also feature a three-day (48-hour) “Law and Society” Hackathon on December 10-12. Participants will focus on analyzing text datasets from investigative reporting on the subject of police (mis)conduct. Those interested are welcome to apply for this year’s TextXD Hackathon by November 15 (applicants will be notified if selected to attend).

Confirmed speakers (so far) include:

David Bamman is an assistant professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved languages and domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture.

Ken Benoit is a Professor of Computational Social Science in the Department of Methodology at the London School of Economics and Political Science, and a Professor (part-time) in the School of Politics and International Relations at the Australian National University. His current research focuses on computational, quantitative methods for processing large amounts of textual data, mainly political texts and social media. His current interests span the analysis of big data, including social media, and methods of text mining. His substantive research in political science focuses on comparative party competition, the European Parliament, electoral systems, and the effects of campaign spending.

Chris Kennedy is a postdoctoral research fellow in Gabriel Brat’s surgical informatics lab. Kennedy’s primary projects are 1) using deep learning-based computer vision to understand surgical videos, and 2) reducing opioid abuse with machine learning and causal inference. He holds a PhD in biostatistics from UC Berkeley, where he worked with Alan Hubbard and Mark van der Laan. His research interests include targeted causal inference (exposure mixtures, optimal individualized treatment regimes, variable importance), deep learning (NLP, computer vision, time series), machine learning, item response theory, experimental design, and survey methods.

Matthew Lavin is a Clinical Assistant Professor of English and Director of the Digital Media Lab at the University of Pittsburgh. Lavin’s scholarly work takes place at the intersection of book history and digital humanities, with equal attention to computational methods and humanities data. His research interests also include American literature, the history of authorship, book history and technology, open access and copyright, and digital pedagogy.

Laura Nelson is an Assistant Professor of Sociology in the College of Social Sciences and Humanities at Northeastern University. She is also core faculty at the NULab for Text, Maps, and Networks; a Faculty Affiliate at the Network Science Institute; an Executive Committee member of the Women’s, Gender, and Sexuality Studies Program; and a member of the Editorial Board of Signs. Nelson uses computational tools, principally automated text analysis, to study social movements, culture, gender, institutions, and organizations. She is an open source and open science enthusiast, seeking to use open-source tools and computational methods to make the social sciences and humanities more transparent, reproducible, and scalable.

Julia Silge is a data scientist and software engineer at RStudio, where she works on open source modeling tools. As a software developer for the data science ecosystem in R, Silge is best known for her tidytext package, which is downloaded from CRAN about 40,000 times per month. Silge studied physics and astronomy, and worked in academia (teaching and doing research) and ed tech before moving into data science and discovering R. She is both an international speaker and a real-world practitioner focusing on data analysis and machine learning practice. She has written books with collaborators about text mining, supervised machine learning for text, and modeling with tidy data principles in R.

Irena Spasic is a professor of Computer Science and Informatics at Cardiff University. Her academic efforts are focused on establishing excellence in research related to text mining, in order to gain knowledge for significant interventions and decision making in the context of big data. She has made contributions in the areas of text classification, information extraction, term recognition, and sentiment analysis. Her research has been most often applied in the health and life sciences, where it has led to interdisciplinary collaboration with impacts beyond computer science. For example, she is a co-founder of HealTex, the UK Healthcare Text Analytics Network, a multi-disciplinary research network that aims to facilitate the use of healthcare free text (clinical notes, letters, social media posts, literature) in research and clinical practice.

New speakers and detailed program information will be available soon on the TextXD website.  

Subscribe to the TextXD Mailing List for further details and updated information.

Questions about the TextXD initiative and this event may be directed to textxd@berkeley.edu.



Featured Fellows

David Bamman

Berkeley School of Information
Faculty Affiliate

Chris Kennedy

Biostatistics, UC Berkeley
Alumni - BIDS-BBDT Data Science Fellow

Laura K. Nelson

Digital Humanities
Alumni - Data Science Fellow

Adam G. Anderson

Research Training Program Manager

Heather Haveman

Sociology and Business/Management, Haas School of Business, UC Berkeley
BIDS Faculty Council

Ciera Martinez

Biodiversity and Environmental Sciences Lead

David Mongeau

Executive Director

Maryam Vareth

Health and Life Sciences Lead

Niek Veldhuis

Assyriology, Near Eastern Studies, UC Berkeley
BIDS Faculty Council