Computer Science


Project Jupyter is a community of open-source developers, scientists, educators, and data scientists. Its goal is to build open-source tools and foster a community that together facilitate scientific research, reproducible and open workflows, education, computational narratives, and data analytics. Jupyter supports over 100 programming languages and connects data analytics tools across a range of disciplines and communities.

The Berkeley Institute for Data Science (BIDS) supports several core Jupyter projects.


Scikit-image is a community-driven Python project that provides a vast collection of high-quality, peer-reviewed image-processing algorithms, available to a global community of researchers free of charge and free of restriction. The library is widely used in many fields, including astronomy, biomedical imaging, and environmental resource management. Scikit-image was founded by BIDS Research Data Scientist Stéfan van der Walt in 2009.

Data Science for Hard Core Humanists: Opportunities and Challenges from Computational Assyriology


Participants in this project work on the core data science software stack for the Julia programming language, building the equivalent of pandas for Python and the tidyverse for R. The stack covers DataFrame types, query languages, file I/O, distributed query execution, database connections, plotting, and data visualization. The project focuses on both usability and high performance, with the goal of creating a next-generation platform for big data science work.