Project Jupyter: Architecture and Evolution of an Open Platform for Modern Data Science

Berkeley Distinguished Lectures in Data Science

Project Jupyter, evolved from the IPython environment, provides a platform for interactive computing that is widely used today in research, education, journalism and industry. The core premise of the Jupyter architecture is to provide tools for human-in-the-loop interactive computing. It provides protocols, file formats, libraries and user-facing tools optimized for the task of humans interactively exploring problems with the aid of a computer, combining natural and programming languages in a common computational narrative.

I will discuss how Jupyter was deliberately designed with an open architecture, open standards and an open community to support both the technical and the human goals of the project. Open protocols and standards have helped unify workflows focused on computation and communication across programming languages; a layered architecture encourages innovation and industry adoption at various levels; and an open community engages stakeholders from high-school education to cutting-edge scientific research. I will conclude with a review of new developments, including major contributions from the Berkeley community.

The Berkeley Distinguished Lectures in Data Science, co-hosted by the Berkeley Institute for Data Science (BIDS) and the Berkeley Division of Data Sciences, feature faculty doing visionary research that illustrates the character of the ongoing data, computational, and inferential revolution. All campus community members are welcome and encouraged to attend.  Arrive at 3:30pm for tea, coffee and discussion prior to the formal presentation.


Fernando Pérez

Assistant Professor, Statistics Department

Fernando Pérez is an assistant professor in Statistics at UC Berkeley and a Faculty Scientist in the Department of Data Science and Technology at Lawrence Berkeley National Laboratory. After completing a PhD in particle physics at the University of Colorado at Boulder, his postdoctoral research in applied mathematics centered on the development of fast algorithms for the solution of partial differential equations in multiple dimensions.  Today, his research focuses on creating tools for modern computational research and data science across domain disciplines, with an emphasis on high-level languages, interactive and literate computing, and reproducible research.  He created IPython while a graduate student in 2001 and co-founded its successor, Project Jupyter. The Jupyter team collaborates openly to create the next generation of tools for human-driven computational exploration, data analysis, scientific insight and education.

He is a National Academy of Sciences Kavli Frontiers of Science Fellow and a Senior Fellow and founding co-investigator of the Berkeley Institute for Data Science.  He is a co-founder of the NumFOCUS Foundation and a member of the Python Software Foundation. He is the recipient of the 2012 Award for the Advancement of Free Software from the Free Software Foundation.