Harnessing Google Health Trends API Data: A How-to Guide for Epidemiologic Research

Berkeley Computational Social Science Forum

CSS Training Program

April 5, 2022
4:00pm to 5:00pm
Virtual Participation




Date: Tuesday, April 5, 2022
Time: 4:00-5:00 PM Pacific Time
Location: Virtual Participation – Register to attend via Zoom

Speaker: Krista Neumann, Berkeley Computational Social Science Fellow, BIDS and Berkeley School of Public Health
Abstract: Interest in using internet search data, such as that from the Google Health Trends Application Programming Interface (GHT-API), to measure epidemiologically relevant exposures or health outcomes is growing because these data are accessible and timely. Researchers input search term(s), a geography and a time period, and the GHT-API returns the searches matching those terms as a scaled proportion of all searches within the specified geography and time period. In this study we detail a method for using these data to measure a construct of interest in five iterative steps: first, identify phrases the target population may use to search for the construct of interest; second, refine candidate search phrases with incognito Google searches to improve sensitivity and specificity; third, craft the GHT-API search term(s) by combining the refined phrases; fourth, test search volume and choose geographic and temporal scales; and fifth, retrieve and average multiple samples to stabilize estimates and address missingness. An optional sixth step normalizes the estimates to account for changes in total search volume. We present a case study examining weekly state-level child abuse searches during the COVID-19 pandemic as an application of this method and describe its limitations.
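As a rough illustration of the fifth step (averaging multiple samples to stabilize estimates and address missingness), the sketch below averages repeated samples period by period and skips missing values. It is not the speaker's implementation and makes no GHT-API calls: the function name `average_samples`, the use of `None` to mark a missing value, and the numbers are all hypothetical assumptions chosen for the example.

```python
from statistics import mean
from typing import List, Optional, Sequence


def average_samples(
    samples: Sequence[Sequence[Optional[float]]],
) -> List[Optional[float]]:
    """Average repeated samples period by period, skipping missing values.

    Each inner sequence is one retrieved sample covering the same periods
    (for example, the same weeks). None marks a missing value; that encoding
    is an assumption of this sketch, not something the abstract specifies.
    """
    if not samples:
        return []
    n_periods = len(samples[0])
    if any(len(sample) != n_periods for sample in samples):
        raise ValueError("all samples must cover the same periods")

    averaged: List[Optional[float]] = []
    for period_values in zip(*samples):
        observed = [v for v in period_values if v is not None]
        # A period stays missing only if every sample lacks a value for it.
        averaged.append(mean(observed) if observed else None)
    return averaged


# Hypothetical scaled search proportions: three samples over four weeks.
samples = [
    [10.0, None, 4.0, None],
    [12.0, 6.0, None, None],
    [14.0, 8.0, 5.0, None],
]
weekly_estimates = average_samples(samples)
print(weekly_estimates)  # [12.0, 7.0, 4.5, None]
```

In this toy input, week 2 is missing from one sample and week 3 from another, yet both weeks still get an estimate from the remaining samples; only week 4, missing everywhere, stays missing.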

The Computational Social Science Forum is an informal setting for the interdisciplinary exchange of ideas and scholarship at the intersection of social science and data science. Participants engage in a variety of activities such as presentations of work in progress, discussions and critiques of recent papers, introductions to new tools and methods, discussions around ethics, fairness, inequality, and responsible conduct of research, as well as professional development. This Forum is organized as part of the Computational Social Science Training Program, and weekly meetings are hosted by researchers from BIDS and D-Lab. The group welcomes social scientists and researchers with interests in data science methods and tools, and data scientists with applications or interests in public policy, social, behavioral, and health sciences. Participants include graduate students, postdocs, staff, and faculty, and members are encouraged to attend regularly in order to foster community around improving computational social science research, supporting the development and research of group members, and fostering new collaborations. Interested UC Berkeley community members are invited to use this registration form to receive the schedule and access links. Please contact css-t32@berkeley.edu for more information or if you are interested in presenting current research for an upcoming session.


Krista Neumann

PhD Student, Epidemiology & Biostatistics, School of Public Health, UC Berkeley

Krista Neumann is a PhD student in Epidemiology at Berkeley’s School of Public Health. Her goal is to apply her training in Mathematics to social epidemiological questions focused on reducing health disparities. Neumann’s current research aims to answer a causal question: which policies, programs and interventions most successfully reduce systematically embedded barriers to health for marginalized and low-income communities? She is particularly interested in food and nutrition insecurity as a specific outcome, as well as ways to overcome obstacles that prevent evidence from effecting meaningful policy changes. Neumann holds a Bachelor of Mathematics from the University of Waterloo in Canada.