Project Jupyter is a community of open-source developers, scientists, educators, and data scientists. Its goal is to build open-source tools and create a community that facilitates scientific research, reproducible and open workflows, education, computational narratives, and data analytics. Jupyter supports over 100 programming languages and connects data analytics tools across a range of disciplines and communities. In 2001, Fernando Pérez (current BIDS Faculty Director) started IPython, one of the foundational tools for analyzing large amounts of data in a transparent and collaborative way. IPython has since evolved into Project Jupyter.
In March 2024, Project Jupyter received a special award from the White House Office of Science and Technology Policy (OSTP).
The Berkeley Institute for Data Science supports several core Jupyter projects:
JupyterHub provides remote access to Jupyter servers on shared infrastructure, with the goal of making high-powered computational environments and resources more accessible to students, researchers, and collaborators. JupyterHub runs in the cloud or on your own hardware, and makes it possible to serve a pre-configured data science environment to any user in the world. It is used in education and large-scale courses as well as in collaborative and massively open data analytics projects.
JupyterLab, Jupyter’s next-generation interface, empowers data scientists to compose the interface that suits their needs. It is a flexible and extensible user interface meant to support the diversity of workflows in data science. JupyterLab runs on the same Jupyter server as the traditional Notebook interface, which allows it to be accessed remotely on shared infrastructure (for example, via a JupyterHub).
The Jupyter Notebook is a web-based interactive computing platform and open document format that allows users to author computational narratives that combine live code, equations, narrative text, interactive user interfaces, and other rich media. It enables the collaborative creation of reproducible computational narratives that can be used across a wide range of audiences and contexts, and in any data science workflow.
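To make the "open document format" concrete, here is a minimal sketch of what a notebook file contains. It assumes version 4 of the notebook format, in which a notebook is a JSON document holding a list of cells. The helper name make_notebook and the example cell contents are hypothetical, chosen only for illustration; real notebooks usually carry more metadata (kernel information, cell outputs) than shown here.

```python
import json


def make_notebook(cells):
    """Build a minimal version-4 notebook dict from (cell_type, source) pairs.

    Hypothetical helper for illustration; not part of any Jupyter library.
    """
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells also carry an execution counter and a list of outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    return {
        "cells": nb_cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# One narrative cell (text plus an equation) followed by one live-code cell.
notebook = make_notebook([
    ("markdown", "# Area of a circle\nFor radius $r$, the area is $A = \\pi r^2$."),
    ("code", "import math\nmath.pi * 2 ** 2"),
])

# Serialized form: this JSON text is what would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

Because the file is plain JSON, a notebook can be read, diffed, and generated by ordinary tools, which is part of what makes the format open.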
Binder allows users to create sharable, interactive, reproducible coding environments from materials that they put in online repositories like GitHub. Binder’s goal is to lower the barrier to sharing your scientific work, distributing educational materials, and communicating your work in an interactive fashion. It is both a free, open-source technology that others can deploy in the cloud and a public service that hosts nearly 7,000 daily sessions at mybinder.org.
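As a rough illustration of how a GitHub repository maps to a Binder session, the sketch below assembles a launch link for the public mybinder.org service. It assumes the commonly used link pattern of /v2/gh/ followed by the owner, repository, and git ref, with an optional urlpath query parameter naming what to open. The function name binder_url is hypothetical, and binder-examples/requirements is used only as a sample repository.

```python
from urllib.parse import quote, urlencode


def binder_url(owner, repo, ref="HEAD", urlpath=None, host="https://mybinder.org"):
    """Assemble a Binder launch link for a GitHub repository.

    Hypothetical helper; the URL layout is an assumption based on the
    links that mybinder.org commonly uses.
    """
    # Encode the ref fully so that branch names containing "/" stay one path segment.
    url = f"{host}/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    if urlpath:
        # Optional: ask the session to open a specific interface or file.
        url += "?" + urlencode({"urlpath": urlpath})
    return url


example = binder_url("binder-examples", "requirements")
```

Sharing such a link is typically all a reader needs: opening it asks the service to build the environment described in the repository and start a live session.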