On March 23, 2021, from 10:00 AM to 3:00 PM Pacific, BIDS GraphXD 2021 will convene data science experts from a variety of fields to promote interdisciplinary collaboration and training for researchers, scientists, and theorists interested in using graphs and network analysis for applications across domains. This year’s program will feature presentations from distinguished speakers and tutorials focused on NetworkX, the fundamental network analysis tool for creating and manipulating graphs and networks in Python. Presentations and tutorials will highlight this framework’s utility and diversity in a variety of applications. All researchers, scientists, and theorists interested in using graphs and network analysis are welcome and encouraged to attend this event.
This event will be presented on the platform Discord, an open access virtual platform to encourage ongoing participation and engagement across a global community of researchers. Participants can engage with this platform at any time to chat with colleagues and to build research collaborations. First, sign into the BIDS XD Community on Discord. If you’re new to the BIDS XD Community or the Discord platform, we encourage you to take a few minutes to sign in to get acclimated in the days leading up to the event. Then, on March 23, participants may explore the GraphXD “channels” and participate in a variety of ways:
— View the presentations and tutorials in presentations. This broadcast channel can become a pop-out window for wider-screen viewing while participants
— Ask questions and chat with presenters and other participants in q-and-a, and
— Engage in face-to-face discussions with presenters and others in the lounges (dijkstra, edmonds, hopcroft, kosaraju), where most of our speakers will be available to take your questions directly during the social/networking session at 2:00-3:00 PM Pacific.
Program and Abstracts
(Pacific Time Zone)
10:00 AM — Session 1: Graphs and Network Analysis
Welcome to BIDS and GraphXD
BIDS Executive Director
Exploring network structure, dynamics, and function with NetworkX
K. Jarrod Millman
Graduate Student, Division of Biostatistics, School of Public Health, UC Berkeley
Abstract: NetworkX is the reference library for network science algorithms in Python. Its creation and development were driven by research applications such as disease spread, cybersecurity, and measuring scholarly impact. Today it is a mature package that has a broad range of algorithms, a low barrier to entry (easy to learn), the ability to convert to many data formats, a simple design, good documentation that includes citations, and strong ties to the larger scientific Python ecosystem. I will briefly recount its history, describe its basic data model, provide an overview of features, discuss recent improvements, and invite you to contribute.
The Inequality of Intersectionality: An Empirical Example of Using "Small" Networks for Historical Insights
Laura K. Nelson
Assistant Professor of Sociology, Northeastern University
Abstract: Using the Chicago women’s movement as an example of local organizing during the first-wave women’s movement (1860-1920), I combine insights from historical narratives, a network analysis of connections between organizations, and word counts from the public-facing literature produced by local women’s movement organizations, to compare three different identities that intersected with gender in this movement: immigration, class, and race. I find that while the intersection of race and gender was crucial to this movement, race was never fully integrated into the movement in the way class and nativity were. In other words, the first-wave women’s movement was foundationally intersectional, but not all of these intersections were treated equally in the mainstream of the movement. In this talk I use this empirical example to reflect on how network analytic techniques, made popular recently via their use on very large datasets, can be used on "small" data to produce important historical (and contemporary) insights. I end with a reflection on how to best collect and analyze "small" network data, with attention to the potential effects of missing data as well as the ethical and privacy issues that are heightened when analyzing small networks.
— Break (~10 mins)
Network Measures for Complex Contagions
Assistant Professor of Management of Organizations, Berkeley Haas School of Business
Abstract: The standard measure of distance in social networks – average shortest path length – assumes a model of “simple” contagion, in which people only need exposure to influence from one peer to adopt the contagion. However, many social phenomena such as health behaviors, linguistic conventions, and novel technologies are “complex” contagions, in which people need to be exposed to social influence from multiple peers before they adopt. In this study, we show that the classical measure of path length fails to capture two key features of networks essential for studying the diffusion of complex contagions, namely connectedness and node centrality. As a result, the classical measure of path length frequently misidentifies the features of empirical social networks that are most effective for spreading complex social contagions. To address this issue, we derive new topological measures of complex path length and complex centrality, which accurately identify network connectedness, social distance and node centrality for the spread of complex contagions. We show that these measures enable significant improvements in the capacity to predictively identify the network structures, and the most central individuals, for increasing the spread of complex contagions. We test our theory using a canonical dataset on the spread of a microfinance program within 43 rural Indian villages. The findings show that complex path length and complex centrality outperform all standard measures of centrality based on the classical measure of path length.
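To make the distinction between "simple" and "complex" contagion concrete, here is a minimal pure-Python sketch of a threshold adoption rule. It is only an illustration of the general idea and does not implement the complex path length or complex centrality measures described in the abstract. The function names `spread` and `ring_lattice`, the toy ring networks, and the threshold values are all assumptions chosen for the example.

```python
def spread(adjacency, seeds, threshold):
    """Return the set of nodes that eventually adopt under a threshold rule.

    A node adopts once at least `threshold` of its neighbors have adopted.
    threshold=1 models a "simple" contagion; threshold>=2 a "complex" one.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adjacency.items():
            if node in adopted:
                continue
            # Count how many of this node's neighbors have already adopted.
            exposures = sum(1 for nb in neighbors if nb in adopted)
            if exposures >= threshold:
                adopted.add(node)
                changed = True
    return adopted


def ring_lattice(n, k):
    """Hypothetical toy network: n nodes in a ring, each linked to its
    k nearest neighbors on each side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0}
            for i in range(n)}


sparse_ring = ring_lattice(10, 1)     # each node has 2 neighbors
clustered_ring = ring_lattice(10, 2)  # each node has 4 neighbors

simple_on_sparse = spread(sparse_ring, {0, 1}, threshold=1)
complex_on_sparse = spread(sparse_ring, {0, 1}, threshold=2)
complex_on_clustered = spread(clustered_ring, {0, 1}, threshold=2)

print(len(simple_on_sparse), len(complex_on_sparse), len(complex_on_clustered))
```

In this toy setup the simple contagion reaches every node of the sparse ring, while the complex contagion stalls at its two seeds there and spreads fully only on the more clustered ring, where neighbors overlap enough to supply multiple exposures.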
Individual-level ecological networks and intraspecific variation
Assistant Professor, Institute of Ecology and Evolution, Data Science Initiative, University of Oregon
Introduction and Methods: Individuals vary extensively in their morphology and behavior, and this variation can have large impacts on species interactions. While this variation has the potential to affect a species’ interactions with all their partners (e.g., competitors, prey, predators), the interactions of a species with each of these communities are generally examined separately. Thus, how similar intraspecific variation is across contexts and how it affects interactions with different types of partners is unknown. To examine whether species vary more in some of their interactions than others, we used DNA metabarcoding to identify the gut microbes and pollen species carried by ~1500 bees captured in California sunflower fields. Using this dataset, we compared the variability of bee individuals in their plant and microbe interactions. Given that flowers act as transmission hubs for bee pathogens and bacteria, we predicted that 1) variability in pollen interactions would be positively correlated with variability in microbe interactions. As many bacteria face large barriers to transmission and recruitment, we predicted 2) that pollen interactions would be more variable than microbe interactions. Finally, we predicted 3) that variability in microbial interactions would be lower when the plant-bee interaction networks were dominated by a few highly visited flower “hubs”.
— Break (~10 mins)
12:00 PM — Session 2: NetworkX Basics Tutorial
Introduction to Network Analysis with NetworkX
Mridul Seth, Research Software Engineer, GESIS Leibniz Institute for the Social Sciences; and Ross Barnowski, BIDS Scientific Software Developer
Abstract: Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial we will introduce you to the fundamentals of network thinking. This tutorial is geared towards scientists who are capable of programming in Python using Python's built-in data structures, and are seeking a new computational skill in the analysis of graphs. No prior knowledge of graph theory is required. By the end of the tutorial, you will be able to create, manipulate, analyze, and visualize graphs using the NetworkX API. There will be enough exposure to the NetworkX API and documentation to help you jumpstart and further your learning journey into graph theory!
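For readers who want a preview of the kind of code such a tutorial involves, the following is a minimal sketch of creating, manipulating, and analyzing a small graph with NetworkX. It is not taken from the tutorial materials; the friendship network and the names in it are invented for illustration, and visualization is left out so the example needs no plotting library.

```python
import networkx as nx

# Create an undirected graph from a made-up friendship network.
G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"),
    ("Ben", "Cara"),
    ("Cara", "Dev"),
    ("Ana", "Cara"),
])

# Manipulate: add a person who has no connections yet.
G.add_node("Eli")

# Analyze: node degrees and the shortest path between two people.
degrees = dict(G.degree())
path = nx.shortest_path(G, "Ana", "Dev")

# A very simple "friend recommendation" signal: shared neighbors of
# two people who are not yet directly connected.
mutual = sorted(nx.common_neighbors(G, "Ana", "Dev"))

print(G.number_of_nodes(), G.number_of_edges())
print(degrees)
print(path)
print(mutual)
```

Real recommendation systems are far more involved; counting shared neighbors is used here only because it is one of the simplest graph-based signals to compute.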
— Break (~10 mins)
1:00 PM — Session 3: NetworkX Application Tutorial
Network Analysis of Ancient Sumerian Texts
Adam Anderson, BIDS Research Program Training Manager, and Niek Veldhuis, Professor of Assyriology, Department of Near Eastern Studies, UC Berkeley, and the Sumerian Networks Project Data Science Discovery Team
Abstract: In this demonstration, we will use NetworkX to help solve a riddle contained in a small administrative archive of cuneiform tablets from the ancient Sumerian city-state of Puzrish-Dagan, modern Drehem, Iraq (2100-2000 BC). The archive contains many records of the production of fine shoes, along with precious metals and gems, but why does this small collection of 300 texts exist among thousands of administrative records? To help answer this question, we use network analysis in order to map the relationships between the actors of this small archive, and visualize the social network to find the leaders and their cliques in the archive.
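As a rough idea of what "finding the leaders and their cliques" can look like in NetworkX, here is a minimal sketch. It is an assumption about one possible approach, not the presenters' actual method or data: the actor labels and edges are hypothetical placeholders, degree centrality is just one of several ways to rank actors, and maximal cliques are one possible reading of "cliques".

```python
import networkx as nx

# Hypothetical edges: two actors are linked if they appear together
# in at least one record. These labels are placeholders, not names
# from the archive.
edges = [
    ("A", "B"),
    ("A", "C"),
    ("B", "C"),
    ("A", "D"),
    ("D", "E"),
]
G = nx.Graph(edges)

# Rank actors by degree centrality (fraction of other actors each
# one is directly connected to) and pick the top-ranked one.
centrality = nx.degree_centrality(G)
leader = max(centrality, key=centrality.get)

# List maximal cliques with at least three members, i.e. groups in
# which every actor is linked to every other actor.
cliques = [sorted(c) for c in nx.find_cliques(G) if len(c) >= 3]

print(leader, centrality[leader])
print(cliques)
```

Other centrality measures available in NetworkX, such as betweenness or eigenvector centrality, may rank actors differently, so the choice of measure depends on what "leader" is taken to mean in the archive.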
— Break (~10 mins)
2:00 – 3:00 PM — Session 4: GraphXD Community
For this social and networking hour, participants are invited to navigate to the Discord Lounge areas — dijkstra, edmonds, hopcroft, kosaraju — to engage directly in conversations and discussion with the speakers and other participants. Most of our speakers and presenters will be available in the Lounges during this session. Explore the lounges to find or to initiate a discussion that resonates with your interests and research.
- Adam Anderson, BIDS Research Training Program Manager
- Alexandre de Siqueira, BIDS Researcher
- Marsha Fenner, BIDS Communications/Program Manager
BIDS GraphXD (Graphs Across Domains) is a cross-domain initiative that promotes interdisciplinary collaboration and training for researchers, scientists, and theorists interested in using graphs and network analysis for applications in a variety of fields across STEAM including (but not limited to) anthropology, art, biology, computer science, economics, history, linguistics, mathematics, physics, and sociology.