This summer, I had the unique opportunity to intern at the Berkeley Institute for Data Science (BIDS). It was an eye-opening experience that taught me much about open source software.
Having just graduated from UC Berkeley and taken a few data science courses, I was already familiar with Python and its many libraries. However, I had never been exposed to the inner workings of an open source project: the infrastructure, maintainers, and community efforts that allow Python to power countless projects worldwide.
Under the guidance of BIDS Executive Director Kirstie Whitaker, I worked on a variety of challenging, hands-on projects. In this final blog post, I would like to share three impactful lessons I learned about open source.
First, choosing to open-source a project is a deliberate decision. It requires effort to set up, but in return, it encourages new perspectives and meaningful innovation. I saw this firsthand while staffing the 2025 National Workshop on Data Science Education. At a BIDS-hosted panel featuring pioneers such as BIDS Faculty Director Fernando Pérez, I was amazed at the range of open-source initiatives on display. One highlight was JupyterHub, a tool co-created by Professor Pérez that I had used often without a second thought. Through the panel, I learned that JupyterHub is particularly valuable in education because it runs directly in the browser, lowering barriers for students learning to program. Because it is open source, it also supports various forks and add-ons, such as the AI assistant demoed by Pérez (which can sit inside a Jupyter Notebook!). This flexibility extends JupyterHub into other fields, such as geography and public health. Later, I also supported a panel by preparing a short live demo for Kirstie Whitaker, which showcased how to contribute, review, and accept a pull request. Seeing it on-screen was especially exciting. After the event, I wrote an impact post featuring a panel of four educators piloting Data 6, a new introductory data science course. Since the course materials are open source, educators can not only improve them collectively but also adapt and “fork” smaller versions to fit local teaching environments.
Second, maintaining an open-source project is far more involved than I ever imagined. One of my long-term projects was contributing to The Turing Way, an open-source book that promotes open science, reproducibility, and best practices in open-source development. My role was to curate existing educational content and integrate it into various chapters. For example, I added substantial material to the "Motivation for Using GitHub" page.
To most readers, The Turing Way might appear as a polished, informational resource. But behind the scenes, a great deal of coordination and care is required to make it all possible. Maintainers hold regular video calls, bringing together contributors from around the world. Despite differences in time zones and locations, they collaborate to create a resource that benefits the common good. Changes are continuously being discussed and accepted; the repository has received over 2,800 pull requests! The book also includes a section designed to onboard new contributors, offering style guides, templates, and step-by-step workflows. This level of community support and infrastructure makes the project itself seem alive; it is constructive and encouraging, without restricting or stifling new contributions. I realized that effective project management, though often invisible to end users, is what makes sustainable, high-quality open-source development possible.
My final takeaway is that open-source projects cultivate vibrant, lasting communities. I had the opportunity to write an impact story for the 2025 SciPy Developer Summit. Working with Jarrod Millman and Stéfan van der Walt, I wrote about the integration of SciPy sparse arrays into the Python Spatial Analysis Library (PySAL). As I learned about the summit and attended meetings with about 40 core maintainers, I was especially struck by the group's engaging and supportive dynamic. The depth of discussion, mixed with friendly banter, made it clear that their camaraderie extended well beyond the project. Many contributors juggle multiple professional responsibilities, yet they all consistently make time to meet and collaborate. Throughout my time at BIDS, I often saw familiar names reappearing across my different projects, giving me a sense of continuity and connectedness. I realized that open source isn't just about the code being written. It is also about people coming together to advance a shared vision: that a project can evolve hand-in-hand with its community.
Photo: (from left) Kyle Cheng, Tyler Hawthorne, Jamilah Karah (credit: Dee Rossiter)
The internship allowed me to work closely with my colleagues Kirstie Whitaker, Adrian Hill, Lilli Wessling Hart, Jamilah Karah, and Tyler Hawthorne. Working alongside them was inspiring, and I am grateful for such meaningful projects. I have learned so much from our talks and deeply appreciate that they have made me a part of the open-source community.
Thank you!