TextXD 2020 — Text Analysis Across Domains

Speaker(s)

Adam G. Anderson

BIDS Alum – Research Training Program Manager

Adam G. Anderson advised graduate students in the Computational Social Science Training Program, managed the Computational Social Science Forum, and helped organize BIDS's cross-domain (XD) initiatives. He was also a lecturer in Digital Humanities and Data Science, and an academic coordinator for Digital Humanities at Berkeley, where he co-authored and designed the Theory and Methods curriculum for the DigHum Minor and Certificate Program. He was also a co-coordinator for the Digital Humanities Working Group (DHWG) and the Computational Text Analysis Working Group (CTAWG), as well as the topic area lead in Network Analysis and Text Analysis at the D-Lab. His work brings together the fields of computational linguistics, archaeology and Assyriology / Sumerology to quantify the social and economic landscapes emerging during the Bronze Age in the ancient Near East. His research interests include network analysis, archival studies, geospatial mapping and language modeling (NLP). He applies these mixed methods to large datasets of ancient texts and archaeological records, in order to better understand the lives of individuals and groups within ancient societies, and to relate these findings within the context of our lives today. He holds a PhD in Near Eastern Languages and Civilizations from Harvard University, an MA (Zwischenprüfung) in Assyriology from Ludwig-Maximilians University, and a BA in Linguistics from Brigham Young University.

Alex de Siqueira

Assistant Project Scientist, Data Science Outreach Lead

Alex de Siqueira is a postdoctoral researcher at BIDS, working on open source algorithms for processing computed tomography (CT) 3D images. He received his MS and PhD from the State University of São Paulo, Brazil, applying image processing tools to tackle challenges in materials science and geochronology. A core developer of scikit-image, he has been an open source and free software enthusiast since his first contact with Linux in 2000, contributing to several projects and events in Latin America and Europe. Alex also worked as a postdoctoral fellow at the State University of Campinas, Brazil, and the TU Bergakademie Freiberg, Germany, where he created pytracks and wrote Octave - Your first steps on scientific programming (in Brazilian Portuguese).

Marsha Fenner

Communications/Program Manager, Berkeley Institute for Data Science

Marsha Fenner works to connect researchers, to facilitate interdisciplinary collaboration, and to implement training, education, and outreach programs that enhance and expand BIDS' and Berkeley’s vibrant and diverse data science community.  She focuses on developing new programs and events that engage the wider data science community by strengthening partnerships and integrating research efforts among a wide array of disciplines and departments across campus and beyond.  Marsha has managed communications, training/education/outreach programs and administrative operations for scientific programs and research initiatives at UC Berkeley and Lawrence Berkeley National Laboratory, including the Innovative Genomics Institute, the DOE Joint Genome Institute and Berkeley Lab's Advanced Light Source.  She holds an MA in philosophy and comparative religious studies, and a BA in classics, philosophy and mathematics.

Heather Haveman

Professor, Department of Sociology, UC Berkeley, Professor, Haas School of Business, UC Berkeley, BIDS Associate Faculty Director

Heather Haveman is a Professor of Sociology and Business at UC Berkeley. She holds a BA in history and an MBA (from the University of Toronto), and a Ph.D. in organizational behavior and industrial relations (from UC Berkeley).  Following positions at Duke University's Fuqua School of Business, Cornell University's Johnson Graduate School of Management, and Columbia University's Graduate School of Business, Professor Haveman joined UC Berkeley in July 2006. Her research interests include how organizations, the fields in which they are embedded, and the careers of their members and employees evolve. Her current work involves American magazines and wineries, Chinese listed firms, and the emerging marijuana market in several US states. 

Ciera Martinez

Biodiversity and Environmental Sciences Lead

BIDS Biodiversity and Environmental Sciences Lead Ciera Martinez focuses on data intensive research projects that aim to understand how life on this planet evolves in reaction to the environment and climate – especially projects involving large and complex datasets. A long-time open science advocate, Ciera has been involved with and continues to be interested in working on training for open data, education, publishing, and software, including developing community standards for data management practices. As a 2019 Mozilla Open Science Fellow, she connected her love of data and museums and worked on projects aimed at understanding and increasing the usability of biodiversity and natural history museum data. She received her PhD in Plant Biology from UC Davis, researching the genetic mechanisms regulating plant architecture. She then went on to become an NSF Postdoctoral Fellow at UC Berkeley in the Molecular and Cellular Biology Department, studying genome evolution. She was also a BIDS postdoctoral Data Science Fellow for three years, working on undergraduate research practices, data science training, community development, and best practices for data science, diversity and inclusion, and computational research.

David Mongeau

Former Executive Director

David Mongeau, now the Founding Director of the School of Data Science at the University of Texas at San Antonio, was the Executive Director of BIDS from April 2018 to June 2021. During that time, in collaboration with the Faculty Director and Faculty Council, he set strategic direction and oversaw the BIDS research, training, and outreach. He also led the institute’s industry and foundation relations and its engagement with other UC and global research institutes, all toward the overarching mission at BIDS to create and deploy data science methods, practices, and technologies to enable discovery. Previously, David co-led the data analytics institute at Ohio State; worked at Battelle, where he championed its proposal for an AI and cybersecurity company, now Covail; and worked for many years at Bell Labs – starting on the team that introduced the first C++ compiler and UNIX System V and leaving after building a global business and technology consulting practice, now part of Nokia Bell Labs Consulting. David earned his undergraduate degree at Carnegie Mellon University, and later earned a graduate degree at Rensselaer Polytechnic Institute and an MBA from Purdue University. Many of his interests lie beyond data science, embracing the humanities and arts.

Maryam Vareth

Health and Life Sciences Lead, Co-Director, Innovate For Health initiative, Affiliated Researcher, Radiology and Biomedical Imaging, UCSF School of Medicine, Affiliated Researcher, Data Science Institute, Lawrence Livermore National Laboratory

Maryam Vareth leads BIDS’ data science research efforts in the Health & Life Sciences. Dr. Vareth is a Co-Director of the Innovate For Health initiative, a collaboration among UC Berkeley, UCSF, and Janssen Pharmaceutical Companies of Johnson & Johnson. As an experienced engineer, researcher, and data scientist, she applies mathematics, statistics and physics to solve unmet needs in healthcare to enhance patients’ experience during their medical journey. She is an advocate for “data-driven” medicine, and in particular for linking medical imaging data with medical diagnostics and therapeutics to extract clinically-relevant insights through the use of open research and open source practices. Dr. Vareth received her BS and MS training in Electrical Engineering and Computer Science (EECS) from UC Berkeley, where she was awarded the prestigious Regents’ and Chancellor’s Scholarship. She completed her PhD through the joint UC Berkeley-UCSF Bioengineering program as a National Science Foundation Fellow, where she was awarded the Margaret Hart Surbeck Endowed Fellowship for Interdisciplinary Research for her work on developing new techniques and algorithms for the acquisition, reconstruction and quantitative analysis of Magnetic Resonance Spectroscopy Imaging (MRSI), with the goal of improving its speed, sensitivity and specificity to improve the management of patients with brain tumors. She conducted her post-doctoral fellowship at UCSF, combining structural, physiological and metabolic imaging data from large clinical trials to quantitatively characterize heterogeneity within malignant brain tumors.

Niek Veldhuis

Professor of Assyriology, Department of Near Eastern Studies, UC Berkeley, BIDS Faculty Council Member

Niek Veldhuis is Professor of Assyriology (cuneiform studies) in the Department of Near Eastern Studies. He received his PhD at the Rijksuniversiteit Groningen (The Netherlands) in 1997, and came to Berkeley in 2002. His primary interests are in the intellectual history of ancient Mesopotamia (History of the Mesopotamian Lexical Tradition, 2014) and Sumerian literature (Religion, Literature and Scholarship: The Sumerian Composition Nanše and the Birds, 2004).  He is director of the NEH-supported Digital Corpus of Cuneiform Lexical Texts and is a member of the international Oracc Steering Committee, providing tools and standards for digital publication of cuneiform texts to scholars worldwide. Today, his main research focus is on developing computational text analysis scripts (primarily in Jupyter Notebooks) for cuneiform datasets.

David Bamman

Assistant Professor, School of Information, UC Berkeley, BIDS Faculty Affiliate

David Bamman is an assistant professor in the School of Information at UC Berkeley, where he works on applying natural language processing and machine learning to empirical questions in the humanities and social sciences. His research often involves adding linguistic structure (e.g., syntax, semantics, coreference) to statistical models of text, and focuses on improving NLP for a variety of languages and domains (such as literary text and social media). Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University.

Chris Kennedy

BIDS Alum – Data Science Fellow

Chris Kennedy is now a postdoctoral fellow in biomedical informatics at Harvard Medical School, focusing on deep learning and causal inference in Gabriel Brat’s surgical informatics lab. He has a PhD in biostatistics from UC Berkeley. He is a senior fellow at UC Berkeley’s D-Lab and is affiliated with the Integrative Cancer Research Group and the Division of Research at Kaiser Permanente Northern California. At BIDS, he was a BIDS - Biomedical Big Data Training (BBDT) Data Science Fellow and a PhD student in biostatistics at UC Berkeley, where he worked with Alan Hubbard. He was also a D-Lab instructor and consultant, and an NIH biomedical big data trainee. His methodological interests encompassed targeted machine learning, randomized trials, causal inference, deep learning, text analysis, signal processing, and computer vision. His applications were primarily in precision medicine, public health, genomics, and election campaigns. His software projects included the SuperLearner ensemble learning system and varImpact for variable importance estimation; he leveraged high performance computing on Savio and XSEDE clusters to accelerate his work. Prior to Berkeley he worked in political analytics in DC, running dozens of randomized trials and integrating machine learning into multi-million dollar programs to improve voter turnout for underrepresented Americans. He has also worked to support climate change action through Al Gore’s Climate Reality Project and the Yale Program on Climate Change Communication. He holds an M.A. in political science from UC Berkeley, an M.P.Aff. from the LBJ School of Public Affairs, and a B.A. in government & economics from The University of Texas at Austin.

Laura K. Nelson

BIDS Alum – Data Science Fellow

Former BIDS Data Science Fellow Laura K. Nelson is an Assistant Professor of Sociology in the College of Social Sciences and Humanities at Northeastern University. Laura uses computational methods and open source tools, principally automated text analysis, to study social movements, culture, gender, institutions, and organizations. She is particularly interested in developing computational tools that can bolster the way social scientists do inductive and theory-driven research. She received her PhD in sociology from the University of California, Berkeley, and she also holds an MA from UC Berkeley and a BA from the University of Wisconsin, Madison. While at UC Berkeley, she was a postdoctoral fellow with Digital Humanities @ Berkeley, developing a course for undergraduates on computational text analysis in the humanities and social sciences.