Computational Social Science Forum — Using an online sample to estimate the size of an offline population

CSS Training Program

April 5, 2021
12:00pm to 1:30pm
Virtual Participation


Computational Social Science Forum
Date: Monday, April 5, 2021
Time: 12:00-1:30 PM Pacific Time
Location: Register to receive the schedule and access links.

Using an online sample to estimate the size of an offline population 

Speaker: Dennis Feehan, Assistant Professor, Demography, UC Berkeley 
Abstract: Online data sources offer tremendous promise to demography and other social sciences, but researchers worry that the group of people who are represented in online data sets can be different from the general population. We show that by sampling and anonymously interviewing people who are online, researchers can learn about both people who are online and people who are offline. Our approach is based on the insight that people everywhere are connected through in-person social networks, such as kin, friendship, and contact networks. We illustrate how this insight can be used to derive an estimator for tracking the digital divide in access to the Internet, an increasingly important dimension of population inequality in the modern world. We conducted a large-scale empirical test of our approach, using an online sample to estimate Internet adoption in five countries (n ≈ 15,000). Our test embedded a randomized experiment whose results can help design future studies. Our approach could be adapted to many other settings, offering one way to overcome some of the major challenges facing demographers in the information age.

The Computational Social Science Forum is an informal setting for the interdisciplinary exchange of ideas and scholarship at the intersection of social science and data science. Weekly meetings are hosted by researchers from BIDS and D-Lab, and participants engage in a variety of activities such as presentations of work in progress, discussions and critiques of recent papers, introductions to new tools and methods, discussions around ethics, fairness, inequality, and responsible conduct of research, as well as professional development. We welcome social science researchers with interests in data science methods and tools, and data scientists with applications or interests in public policy, social, behavioral, and health sciences. Participants include graduate students, postdocs, staff, and faculty, and members are encouraged to attend regularly in order to foster community around improving computational social science research, supporting the development and research of group members, and fostering new collaborations. This Forum is organized as part of the Computational Social Science Training Program, and interested UC Berkeley community members are invited to use this registration form to receive the schedule and access links. Please contact for more information.

