Challenges of Doing Data-Intensive Research in Teams, Labs, and Groups: Report from the BIDS Best Practices in Data Science Series

R. Stuart Geiger, Dan Sholler, Aaron Culich, Ciera Martinez, Fernando Hoces de la Guardia, François Lanusse, Kellie Ottoboni, Marla Stuart, Maryam Vareth, Nelle Varoquaux, Sara Stoudt, Stéfan van der Walt

SocArXiv
November 14, 2018

Abstract: What are the challenges and best practices for doing data-intensive research in teams, labs, and other groups? This paper reports from a discussion in which researchers from many different disciplines and departments shared their experiences on doing data science in their domains. The issues we discuss range from the technical to the social, including issues with getting on the same computational stack, workflow and pipeline management, handoffs, composing a well-balanced team, dealing with fluid membership, fostering coordination and communication, and not abandoning best practices when deadlines loom. We conclude by reflecting on the extent to which there are universal best practices for all teams, as well as how these kinds of informal discussions around the challenges of doing research can help combat impostor syndrome.

Recommended citation: R. Stuart Geiger, Dan Sholler, Aaron Culich, Ciera Martinez, Fernando Hoces de la Guardia, François Lanusse, Kellie Ottoboni, Marla Stuart, Maryam Vareth, Nelle Varoquaux, Sara Stoudt, and Stéfan van der Walt. "Challenges of Doing Data-Intensive Research in Teams, Labs, and Groups." BIDS Best Practices in Data Science Series. Berkeley Institute for Data Science: Berkeley, California. 2018. doi:10.31235/osf.io/a7b3m



Featured Fellows

R. Stuart Geiger: Ethnographer

Ciera Martinez: Molecular and Cell Biology

François Lanusse: Berkeley Center for Cosmological Physics, FODA Institute

Kellie Ottoboni: Statistics (Alumni, Data Science Fellow)

Maryam Vareth: Health & Life Sciences Lead

Nelle Varoquaux: Statistics (Alumni, Data Science Fellow)

Sara Stoudt: Statistics

Stéfan van der Walt: Senior Research Data Scientist