BIDS welcomes this guest blog post from Jaren Haber, Jae Yeon Kim, and Nick Camp, the organizers of BAY-SICSS 2020:
_________
The first San Francisco Bay Area Summer Institute in Computational Social Science (BAY-SICSS) took place virtually this summer from June 15 to July 3, and we’re delighted with the results. We realized the BAY-SICSS vision of bringing together computational social scientists and practitioners for social good—proof it can be done. We hope this summary report provides a template for future BAY-SICSS organizers to continue this important work.
Organized by Jae Yeon Kim, Jaren Haber, and Nick Camp, this kick-off edition of BAY-SICSS was designed to accommodate the public health crisis and resulting shifts in format and research focus. To date, the Summer Institute in Computational Social Science (SICSS), started in 2017 by Matthew Salganik (Princeton) and Chris Bail (Duke), has trained more than 700 young scholars, bringing together students, postdoctoral researchers, and junior faculty for two weeks of intensive study and interdisciplinary research. BAY-SICSS is unique among the many SICSS partner locations for its focus on bringing together computational social scientists and practitioners for social good.
The Bay Area is a leader not only in technological but also in social innovation, and the region is home to big tech firms, startups, and nonprofits. To take advantage of this, we wanted participants to engage with their communities and develop computational skills by applying them in partnership with local nonprofit organizations. To help computational social scientists network and develop research collaborations with practitioners, our institute ran for three weeks instead of the usual two. We engaged 20 participants from various disciplines with five partner organizations (Code for America, DonorsChoose, Hopelab, UCSF NLP Community and PanaceaLab, UCSF Library) in a fully virtual institute.
You can read much more about our organizing process in our post-mortem. The focus here is on our results.
Our shared commitment to doing computational social science for social good in a time of crisis brought together a range of research domains: youth health (Hopelab), government services (Code for America), classroom support (DonorsChoose), and public discourse in crises past and present (No More Silence and COVID-19 Twitter). We were fortunate that our partner organizations, especially those that joined our cause early on, remained involved despite public health uncertainty and related challenges (remote workplaces, new responsibilities, etc.). In fact, instead of backing out due to the pandemic, they flipped the script: they collaborated with us to take a research angle on COVID-19. Code for America supported an analysis of changes in food stamp applications to learn which communities are most in need. DonorsChoose worked with participants on how school funding and the spread of COVID-19 differ within school districts. And the UCSF NLP Community, No More Silence, and PanaceaLab teams brought into focus public dialogue about the current pandemic and its historical predecessors.
Here are the results that our partners describe from participating in BAY-SICSS, along with a few visuals:
DonorsChoose: “We’re very grateful to BAY-SICSS for providing the space for relationships to form between participants and partner organizations. We've continued to meet with Brian Kim (Ph.D. student in quantitative Education Policy at the University of Virginia), who collaborated with us during the institute and now is helping us think through our approach to analyzing school/student data and equity. He's helped the Virginia Department of Education with some similar questions, and we're finding his experience and perspective extremely helpful.”
To help DonorsChoose support more underprivileged schools, this interactive Shiny map, developed by Cheng Ren (Ph.D. student in Social Welfare with a Designated Emphasis in Computational and Data Science and Engineering at UC Berkeley), provides a tool for exploring whether DonorsChoose funded schools with lower per-pupil expenditures. It includes three layers of demographic information for schools in 8 U.S. states: total population, number of COVID-19 cases, and neighborhood advantage index. In addition, each dot carries school-level information, including the school's relationship with DonorsChoose (e.g., number of projects and total funding) and its per-pupil expenditures, both in absolute terms and relative to other schools in its state. By exploring the association between schools' district locations and their demographic characteristics, including local COVID-19 case counts, the map shows DonorsChoose which schools it currently supports and helps it target schools that need support.

Click on image for access to the online tool.
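For readers curious about what "relative to other schools in their state" can look like in practice, here is a minimal sketch of one possible measure: a within-state percentile rank. This is not the code behind the map. The school names, dollar amounts, and the function `percentile_within_state` are all hypothetical, and the map may define relative spending differently.

```python
from collections import defaultdict

# Hypothetical records: (school, state, per-pupil expenditure in dollars).
schools = [
    ("School A", "CA", 9000.0),
    ("School B", "CA", 12000.0),
    ("School C", "CA", 15000.0),
    ("School D", "VA", 8000.0),
    ("School E", "VA", 11000.0),
]


def percentile_within_state(records):
    """Map each school to the share of same-state schools spending no more than it."""
    # Group expenditures by state so each school is compared only to its peers.
    by_state = defaultdict(list)
    for _, state, spend in records:
        by_state[state].append(spend)

    ranks = {}
    for name, state, spend in records:
        peers = by_state[state]
        ranks[name] = sum(s <= spend for s in peers) / len(peers)
    return ranks


relative_spend = percentile_within_state(schools)
for name, share in relative_spend.items():
    print(f"{name}: {share:.2f}")
```

A low value flags a school that spends less per pupil than most schools in its own state, which is the kind of school the map is meant to help surface.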
Hopelab: “We’ve been very gratified to see the kernel of collaboration we kicked off a year ago with the BAY-SICSS team grow into a rich network of data and social scientists and partner organizations. As we hoped, BAY-SICSS served as a generative springboard for several new areas for us, including bringing increased skills into our project work addressing the health and wellbeing challenges facing teens and young adults—particularly among those hardest hit by the social and economic impacts of COVID-19. We have continued to meet with our institute team members Terresa Eun (Ph.D. student in sociology at Stanford), Sherry Jiang (Ph.D. student in psychology at UCSD), Meredith Meacham (Assistant Professor in the Department of Psychiatry and Behavioral Sciences at UCSF), and Krista Schnell (Ph.D. student in sociology at UC Berkeley) to advance their study of the effects of COVID-19 on social isolation, loneliness and resilience in adolescents’ online lives. We hope insights from this work will help Hopelab and others focused on improving health outcomes for young people to better identify emerging emotional health needs and target dynamic, innovative services and programs to reach those who can most benefit.”
Code for America: “We've continued to collaborate with Vanessa Böhm (postdoc at UC Berkeley and Lawrence Berkeley National Lab) and Chirag Modi (graduating Physics Ph.D. student at UC Berkeley) to examine who’s been most affected by the economic fallout of COVID. We'll use their work to create a new landing page on our microsite about why people need SNAP. Their work will help us examine the circumstances and unmet needs of people who have experienced common life events exacerbated by COVID—such as job or income loss, greater caretaking responsibilities, or having to move to a new place. Their work has been largely about clustering households affected by these life events to see what patterns emerge about who they are and what they need.”
As examples of the collaborative efforts with Code for America, here are two visuals analyzing applications for food assistance (SNAP) that indicate an urgent food need. The first looks at how the composition of these applicants changed from before to after the start of the pandemic. Each of the two plots shows the fraction of applications that fall into one of two clusters: one cluster's share of applications increases significantly (blue numbers), while the other's decreases (red numbers). These models look beyond the general increase in the number of SNAP applications to quantify the change in the applicant pool: for instance, the first plot shows a growing fraction of applicants who are unemployed English speakers with stable housing.

Click on image to access at full size.
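To make the before-and-after comparison concrete, here is a minimal sketch of how each cluster's share of applications could be computed for the two periods. It assumes every application already has a cluster label from some clustering model. The labels, the period names, and the function `cluster_shares` are hypothetical, and this is not the team's actual analysis code.

```python
from collections import Counter

# Hypothetical (period, cluster_label) pairs, one per application.
applications = [
    ("before", 0), ("before", 1), ("before", 1), ("before", 1),
    ("after", 0), ("after", 0), ("after", 0), ("after", 1),
]


def cluster_shares(rows, period):
    """Return {cluster_label: fraction of that period's applications}."""
    counts = Counter(label for p, label in rows if p == period)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


before = cluster_shares(applications, "before")
after = cluster_shares(applications, "after")

# Positive values mean the cluster makes up more of the applicant pool
# after the start of the pandemic; negative values mean less.
change = {
    label: after.get(label, 0.0) - before.get(label, 0.0)
    for label in sorted(set(before) | set(after))
}
print(change)
```

Working with shares rather than raw counts is what separates a change in who is applying from the overall rise in the number of applications.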
This next visual clusters these same applications, showing how the composition of the clusters changes from a simple two-cluster model (top row) down to 14 clusters (bottom row). The variables labeling the circles (in small text) are the distinctive characteristics of each cluster, while the edge labels indicate the variables that change the most as the clusters are refined. The arrows show how applications get reassigned to clusters as the number of clusters increases, helping to identify the most defining properties of each cluster. The first two clusters are defined by job situation and language preference; these groups then split further according to special life events (e.g., pregnancy, lost job), family structure (e.g., young household members, older household members), and special needs (e.g., medical).

Click on image to access at full size.
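The arrows in a figure like this one can be thought of as counts of applications moving from a cluster in the coarser model to a cluster in the finer one. Below is a minimal sketch of that bookkeeping, assuming the same applications have been labeled under two models with different numbers of clusters. The label lists and the function `reassignment_flows` are hypothetical, and the clustering method itself is not shown.

```python
from collections import Counter

# Hypothetical cluster labels for the same six applications under two models.
labels_k2 = [0, 0, 0, 1, 1, 1]  # two-cluster model
labels_k3 = [0, 0, 2, 1, 1, 2]  # three-cluster model


def reassignment_flows(coarse, fine):
    """Count applications moving from each coarse cluster to each fine cluster."""
    if len(coarse) != len(fine):
        raise ValueError("both label lists must cover the same applications")
    # Each (coarse_label, fine_label) pair corresponds to one arrow.
    return Counter(zip(coarse, fine))


flows = reassignment_flows(labels_k2, labels_k3)
for (src, dst), n in sorted(flows.items()):
    print(f"cluster {src} (k=2) -> cluster {dst} (k=3): {n} applications")
```

In this toy example the new third cluster draws one application from each of the original two, the kind of split the arrows make visible at each step.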
_________
We learned a lot from organizing this first incarnation of BAY-SICSS, especially about leading virtual events (again, read about it in our post-mortem). We hope this blog post has shown that bringing together computational social scientists and practitioners for social good is not only possible but also a necessity for both parties and for those who benefit from their work. If you want to apply as a participant for a future version of BAY-SICSS, keep an eye on the BAY-SICSS website for updates (or see past and partner locations). If you want to organize or partner with a future version of BAY-SICSS, email the organizing team at baysicss2020@gmail.com.
Thanks for reading, and we hope you stay well and productive in COVID-times and beyond!
_________
Jaren Haber is a Postdoctoral Fellow at the Massive Data Institute at Georgetown University.
Jae Yeon Kim is a Ph.D. Candidate in Political Science and a D-Lab Senior Data Science Fellow at the University of California, Berkeley.
Nick Camp is an Assistant Professor in Organizational Studies and Psychology at the University of Michigan.