Software

Jupyter

Project Jupyter is a community of open-source developers, scientists, educators, and data scientists. Its goal is to build open-source tools and create a community that facilitates scientific research, reproducible and open workflows, education, computational narratives, and data analytics. Jupyter supports over 100 programming languages, and connects data analytics tools across a range of disciplines and communities. In 2001, Fernando Pérez (current BIDS Faculty Director)...

SkyPortal

SkyPortal is a fully open-source data portal for the collaborative study and management of time-domain sources and events. It interactively displays astronomical datasets for annotation, analysis, and discovery, and is designed to be modular and extensible, so it can be customized for various scientific use-cases. BIDS Faculty Affiliate Joshua Bloom and BIDS Senior Research Data Scientist ...

US Research Software Sustainability Institute (URSSI)

This project is conceptualizing a US Research Software Sustainability Institute (URSSI) that will focus on the entire research software ecosystem — including the people who create, maintain, and use research software — to validate and address various classes of concerns impacting all software development and maintenance projects across all of NSF. BIDS Senior Research Data Scientist Karthik Ram leads this project.

The proposed long-term goals of the institute could include:

Help research...

SciPy

SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. It is widely used by researchers across academia and industry, and has been used in the production of some major scientific results such as the LIGO gravitational wave detection, and the recent imaging of a black hole by the Event Horizon...

scikit-image

Scikit-image is a community-driven Python project, consisting of a vast collection of high-quality, peer-reviewed image processing algorithms that are made available to a global community of researchers free of charge and free of restriction. The library is widely used in many different fields, including astronomy, biomedical imaging, and environmental resource management. Scikit-image was founded by BIDS Research Data Scientist Stéfan van der Walt in 2009.

NumPy

NumPy is the fundamental array package underpinning the Scientific Python ecosystem. BIDS hosts a team of four core developers who work with the NumPy community to develop the library in preparation for the next decade of data science.

NumPy contains, among other things, the following:

A powerful N-dimensional array object
Sophisticated (broadcasting) functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities
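
As a rough illustration of the capabilities listed above, the short sketch below touches the array object, broadcasting, linear algebra, the Fourier transform, and random number generation. The variable names and the sample values are assumptions chosen for the example; they do not come from the NumPy project description.

```python
import numpy as np

# N-dimensional array object: a 3x4 grid holding the values 0..11
grid = np.arange(12).reshape(3, 4)

# Broadcasting: the length-4 vector of column means is applied to every row
centered = grid - grid.mean(axis=0)

# Linear algebra: solve the 2x2 system A @ x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Fourier transform: a sine wave with 5 cycles over 64 samples
t = np.arange(64)
signal = np.sin(2 * np.pi * 5 * t / 64)
peak_bin = int(np.argmax(np.abs(np.fft.rfft(signal))))

# Random numbers: a seeded generator gives reproducible draws
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)

print(centered.mean(axis=0), x, peak_bin, samples.shape)
```

Here `centered` has zero mean in every column, `x` solves the linear system, and `peak_bin` is the frequency bin where the sine wave's energy is concentrated.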

Besides its obvious scientific uses, NumPy can also be used as an...

rOpenSci

rOpenSci is a scientific open-source project whose primary mission is to promote the development and use of high-quality research software in the research community. The rOpenSci team and community enable this transformation by training domain experts in good software development practices and fostering a peer review culture for research software. At the same time, the project builds robust software to help researchers access, discover, publish, and work with disparate types of scientific data. rOpenSci plays a critical role in the scientific software ecosystem,...

Mothra

Mothra analyzes images of butterflies and measures their wing lengths. Using binarization techniques and calculating the resolution of ruler ticks, we read in images of butterflies and output the millimeter lengths of their wings.

The pipeline script combines four modules to analyze an image: ruler detection, binarization, tracing, and final measurement. These modules are located in /butterfly. Python module requirements are listed in requirements.txt.
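
The final measurement step depends on converting pixel distances to millimeters using the ruler's resolution. The sketch below shows one way that conversion could work; it is not Mothra's actual code. The function names `pixels_per_mm` and `to_millimeters`, the sample tick positions, and the assumption that consecutive ruler ticks are 1 mm apart are all hypothetical.

```python
import numpy as np


def pixels_per_mm(tick_positions_px):
    """Estimate image resolution from ruler tick positions (in pixels).

    Assumes consecutive ticks are 1 mm apart (an assumption of this sketch).
    The median spacing is used so a few misdetected ticks have little effect.
    """
    positions = np.sort(np.asarray(tick_positions_px, dtype=float))
    spacing = np.diff(positions)
    return float(np.median(spacing))


def to_millimeters(length_px, px_per_mm):
    """Convert a length measured in pixels to millimeters."""
    return length_px / px_per_mm


# Hypothetical tick positions along the ruler, in pixels
ticks = [100.0, 120.5, 140.0, 160.0, 180.0]
resolution = pixels_per_mm(ticks)

# Hypothetical wing length traced in the binarized image, in pixels
wing_mm = to_millimeters(450.0, resolution)
print(resolution, wing_mm)
```

With these sample values the tick spacings are 20.5, 19.5, 20, and 20 pixels, so the median resolution is 20 pixels per millimeter and a 450-pixel wing measures 22.5 mm.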

Run the pipeline.py file with the arguments to read in raw images...

Cesium

Cesium is an end-to-end machine learning platform for time series that computes machine learning features, builds models, and makes predictions. Cesium has two main components: a Python library, and a web application platform that allows interactive exploration of machine learning pipelines. The Cesium library is specifically designed to handle irregularly sampled time series, as is common in astronomy. BIDS Faculty Affiliate Joshua Bloom and BIDS Senior Research Data...

NetworkX

BIDS hosts several core developers working with NetworkX, a Python package to create, manipulate, and study the structure, dynamics, and functions of graphs and networks.
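
As a small illustration of creating, manipulating, and studying a graph, the sketch below builds a four-node network and queries it. The node labels and edges are made up for the example.

```python
import networkx as nx

# Create: an undirected graph built from a list of edges
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")])

# Manipulate: attach an attribute to an existing edge
G["a"]["c"]["weight"] = 2.0

# Study: shortest path (by hop count), node degrees, and degree centrality
path = nx.shortest_path(G, source="a", target="d")
degrees = dict(G.degree())
centrality = nx.degree_centrality(G)
most_central = max(centrality, key=centrality.get)

print(path, degrees, most_central)
```

In this toy graph the shortest route from "a" to "d" passes through "c", which is also the best-connected node.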

NetworkX is the reference library for network science algorithms in Python. Its creation and development were driven by research applications such as disease spread, cybersecurity, and measuring scholarly impact. Today it is a mature package that has a broad range of algorithms, a low barrier to entry (easy to learn...