NumPy: array programming at the core of the scientific Python ecosystem

September 16, 2020

In a new paper in Nature, members of the NumPy development team -- including BIDS’ Stéfan van der Walt, Jarrod Millman, Sebastian Berg, Nathaniel Smith, Matti Picus, and Tyler Reddy -- take modern data scientists on a complete tour of NumPy array programming, from its origins as a small community project to its emergence as the foundation of a vibrant ecosystem of data analysis tools that now spans an increasingly broad range of research domains and applications.

NumPy has become core scientific infrastructure: a powerful open-source library whose n-dimensional array data structure underpins almost every Python library for scientific or numerical computation, including SciPy, Matplotlib, pandas, scikit-learn and scikit-image. NumPy’s array programming foundation, together with the breadth of tools available within the scientific Python ecosystem, creates a versatile and interactive environment for exploratory data analysis and data-intensive research.
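As a small illustration of the array programming style mentioned above, the following sketch shows whole-array operations replacing explicit loops. The data and variable names are made up for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) x 4 measurements (columns).
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 4.0, 6.0, 8.0],
                 [3.0, 6.0, 9.0, 12.0]])

# A reduction along axis 0 averages each column without a Python loop.
column_means = data.mean(axis=0)   # shape (4,)

# Broadcasting: the (4,) vector is subtracted from every row of the (3, 4) array.
centered = data - column_means

print(data.ndim, data.shape)   # number of dimensions and shape of the array
print(column_means)            # per-column averages
print(centered.sum(axis=0))    # each centered column sums to zero
```

The same n-dimensional array object can be passed directly to libraries such as SciPy or Matplotlib, which is what makes it a common foundation for the ecosystem.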

Figure (Fig. 2 in the Nature paper): NumPy is the foundation of the scientific Python ecosystem. NumPy's n-dimensional array is the fundamental data structure of the scientific Python ecosystem. NumPy also provides means for improving interoperability with other array and analysis libraries.


The NumPy project has endeavored to democratize scientific software by enabling students and researchers to take part in research that would otherwise be reserved for those who can afford more expensive computing environments. It continues to keep pace with the changing landscape of data science by empowering users and establishing protocols that optimize accessibility and efficiency, facilitate interoperability and coordination, diversify utility, simplify adoption, and enable computation and deployment at scale.

BIDS’ Senior Research Data Scientist Stéfan van der Walt and NumPy Software Developers Sebastian Berg and Ross Barnowski are currently part of the core NumPy development team, which at any given time has only 10 active maintainers supporting an estimated 10-15 million NumPy users worldwide. A fully volunteer effort until 2018, NumPy development is now funded by the Moore and Sloan Foundations, and by an award from the Chan Zuckerberg Initiative as part of its Essential Open Source Software for Science program. Continued and consistent sources of funding will be crucial for maintaining and expanding the scientific Python ecosystem.

In the coming years, the number and variety of data science practitioners will continue to expand, and NumPy will rely on the next generation of graduate students and community contributors to cultivate practices that unify, diversify and integrate its community membership, its functionality and its research applications.

Array programming with NumPy
September 16, 2020  |  Nature
Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke & Travis E. Oliphant

Updated September 24, 2020

Featured Fellows

Stéfan van der Walt

Senior Research Data Scientist

Sebastian Berg

Scientific Software Developer (NumPy)

Ross Barnowski

Scientific Software Developer (NumPy)

Jarrod Millman

Biostatistics, UC Berkeley
Alumni - BIDS Data Science Fellow

Nathaniel Smith

Former Computational Fellow

Matti Picus

Alumni - Scientific Software Developer

Tyler Reddy

Alumni - Scientific Software Developer (NumPy)