Computational Social Science Forum — A Supercomputer Reviews the Literature on Organizations: Combining Supervised and Unsupervised Text-Analysis Methods

CSS Training Program

September 28, 2020
12:00pm to 1:30pm
Virtual Participation


Computational Social Science Forum
Date: Monday, September 28, 2020
Time: 12:00-1:30 PM Pacific Time
Location: Register to receive the schedule and access links.

A Supercomputer Reviews the Literature on Organizations: Combining Supervised and Unsupervised Text-Analysis Methods 

Jaren Haber (Georgetown University) 
Heather A. Haveman (UC Berkeley) 
Yoon Sung Hong (Wayfair)

Abstract: Research in many academic fields requires reviewing the literature to determine what we know and don’t know. For interdisciplinary fields, literature reviews are especially challenging because there are more, and more varied, publication outlets. In this paper, we harness computational tools to review the literature in organizational theory, an interdisciplinary field that was developed primarily by sociologists and management scholars. We focus on three perspectives that have dominated this field for the past four decades: demographic, relational, and cultural. We trace these three perspectives’ trajectories using journal articles from JSTOR. We begin with supervised methods, measuring the literature’s engagement with each perspective over time using expert-built dictionaries of terms derived from foundational texts in each perspective. We then augment the dictionaries using unsupervised word-embedding models (word2vec), which analyze each term’s context to find similarly situated terms. We validate results by measuring associations between our dictionary-based measures and measures based on citations to the foundational texts. Preliminary results suggest surprisingly consistent engagement with all three perspectives in both disciplines over time. However, engagement with the demographic and relational perspectives increases among management scholars, and the relational perspective has the highest engagement in both disciplines. We discuss limitations of this approach and next steps.
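
For readers new to these methods, the three steps in the abstract (dictionary-based scoring, embedding-based dictionary expansion, and validation against a second measure) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the similarity threshold, and the two-dimensional toy vectors are all hypothetical, and the hand-written vectors merely stand in for embeddings that a trained word2vec model would supply.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def engagement(text, terms):
    """Share of tokens in `text` that appear in the dictionary `terms`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in terms) / len(tokens)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand(terms, vectors, threshold):
    """Add every word whose vector is close to at least one seed term.

    `vectors` maps words to embeddings; seed terms without a vector are
    kept in the dictionary but cannot attract new words.
    """
    seeds = [vectors[t] for t in terms if t in vectors]
    expanded = set(terms)
    for word, vec in vectors.items():
        if any(cosine(vec, seed) >= threshold for seed in seeds):
            expanded.add(word)
    return expanded


def pearson(xs, ys):
    """Pearson correlation, e.g. dictionary scores vs. citation counts."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy embeddings (a real analysis would load trained vectors).
TOY_VECTORS = {
    "network": (1.0, 0.0),
    "tie": (0.9, 0.1),
    "embeddedness": (0.95, 0.05),
    "age": (0.0, 1.0),
}

relational = expand({"network"}, TOY_VECTORS, threshold=0.9)
score = engagement("Network ties shape the network", relational)
```

In this toy setup, the seed term "network" pulls in "tie" and "embeddedness" but not "age". Note that "ties" in the sample sentence does not match "tie", which illustrates why dictionary methods are sensitive to preprocessing choices such as stemming or lemmatization.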

The Computational Social Science Forum is an informal setting for the interdisciplinary exchange of ideas and scholarship at the intersection of social science and data science. Weekly meetings are hosted by researchers from BIDS and D-Lab, and participants engage in a variety of activities, such as presentations of work in progress; discussions and critiques of recent papers; introductions to new tools and methods; discussions of ethics, fairness, inequality, and responsible conduct of research; and professional development. We welcome social science researchers with interests in data science methods and tools, and data scientists with applications or interests in public policy and the social, behavioral, and health sciences. Participants include graduate students, postdocs, staff, and faculty. Members are encouraged to attend regularly in order to build a community around improving computational social science research, supporting the development and research of group members, and fostering new collaborations. This Forum is organized as part of the Computational Social Science Training Program. Meetings are currently held virtually on Mondays at 12:00-1:30 PM Pacific Time, and interested UC Berkeley community members are invited to register to receive the schedule and access links. Please contact the organizers for more information.


Jaren Haber

MDI Fellow, Georgetown University

Heather A. Haveman

Professor, Department of Sociology, UC Berkeley

Heather A. Haveman is a Professor of Sociology and Business at UC Berkeley. She holds a BA in history and an MBA from the University of Toronto, and a Ph.D. in organizational behavior and industrial relations from UC Berkeley. Following positions at Duke University's Fuqua School of Business, Cornell University's Johnson Graduate School of Management, and Columbia University's Graduate School of Business, Professor Haveman joined UC Berkeley in July 2006. Her research examines how organizations, the fields in which they are embedded, and the careers of their members and employees evolve. Her current work involves American magazines and wineries, Chinese listed firms, and the emerging marijuana market in several US states.

Yoon Sung Hong