BIDS Machine Shop

BIDS Senior Research Data Scientist Stéfan van der Walt previously hosted these undergraduate projects through the BIDS Undergraduate Internships Program.

As more scientific fields intersect with computation, a need arises for software tools that bridge the gap between the subject under investigation and the principles of computation and software engineering. Many scientists are specialists trained in their respective domains, so finding contributors with the practical experience needed to implement computational tools (whether for statistical analysis, data wrangling, machine learning, visualization, or data management) can be difficult. The aim of the BIDS Machine Shop is to serve the scientific community on the UC Berkeley campus by providing the human resources to close this gap.

A lab (or scientist) brings a software request coupled with a domain problem to the Machine Shop. Requests should envision the development of a minimum viable product: a software tool that fills a specified niche, with an estimated development time of one to three months. The Machine Shop reviews proposals and allocates resources as available to develop a proof-of-concept tool. Development takes place in collaboration with one or more representatives from the lab (perhaps graduate students under the principal investigator), who act as liaisons and collaborators and are available to the team for domain-specific questions. The team develops a prototype through continuous feedback from and interaction with the liaison, and the prototype is eventually released under a permissive open source license. Optionally, the team and lab also write a short report on the tool and publish it on an open access platform, such as arXiv.

Researchers interested in incubating a project with the BIDS Machine Shop should get in touch with Stéfan van der Walt.

Previous BIDS Machine Shop Sub-Projects 

  • Butterfly analysis: with the Natural History Museum in London, we are writing software to aid the automated analysis of butterfly specimens from digital photographs. This semester, we are investigating convolutional neural network alternatives to existing segmentation models. (Technologies: PyTorch, scikit-image, NumPy.)
  • SkyPortal: a web platform for the ingestion and display of data from the Large Synoptic Survey Telescope, which is coming online in 2020. (Technologies: Python, JavaScript + TypeScript, Kafka, Bokeh, Dask, Tornado, nginx.)
  • Redesign of the NumPy homepage: the website for NumPy, one of the fundamental computational libraries of the scientific Python ecosystem, hasn't had a revamp in over a decade. We need to bring it in line with modern standards. (Technologies: Hugo, CSS, JavaScript.)

Projects are run on the open source model, where all team members work together on solving problems, reviewing one another's code contributions, and iterating rapidly to produce viable software libraries.

Apprentices will be part of the team developing a computational tool and will learn about Python libraries such as NumPy, SciPy, scikit-image, and scikit-learn; practical software engineering principles; and how to contribute to a collaborative software project.

Qualifications: These projects all require advanced programming in Python. Some computer science and computational background will be helpful. Working knowledge of git is essential (making branches, resolving merge conflicts, submitting pull requests on GitHub). To qualify for the program, you will have to show proof of existing high-quality code published online (e.g., GitHub PRs, an online code repository, a patch to a mailing list, or your own project). We require 9 to 12 hours per week. Please do not apply if your schedule is already stretched.