Exploring the timescales for hydrologic transport with data science tools

June 3, 2019

From among over 80 applicants, 19 participants representing 10 universities, governmental agencies and consulting companies from across the US and Canada were invited to attend the Critical Timescales of Hydrologic Transport Data Science Workshop, held at BIDS on May 22-24, 2019. The workshop brought together two groups: data scientists looking to learn how to apply their skills to real-world hydrologic problems, and researchers trained in watershed hydrology looking to learn new techniques for working with large hydrologic datasets and performing causal inference. The workshop was hosted by BIDS Senior Fellow Laurel Larsen and BIDS-LBNL Data Science Fellow Zexuan Xu (who, along with BIDS Senior Fellow Fernando Pérez, received the BIDS Research Project "Hydrological forecasting and the water/energy nexus"), and was also supported by the USGS Powell Center Working Group for Watershed Storage Control.

Laurel Larsen at the podium.
Photo credit: Liang Zhang.

“BIDS was an ideal venue for this workshop because of its initiative in efforts to bring best practices in data science to the earth and environmental sciences, first through the California Water Data Hackathon earlier in the academic year and now through the new data commons initiative,” said Laurel Larsen, an associate professor in the Department of Geography, acknowledging the support from BIDS. “BIDS helped us reach the right communities - including data-savvy hydrologists and earth science-savvy data scientists - needed to help us make this workshop a success, and it also helped us make exciting connections with BIDS fellows and the greater BIDS community that benefited all participants.”

This workshop was part training, part networking, and part hackathon. It focused on applying multiple time-series analysis techniques, hydrologic modeling, and isotope tracer approaches to understand the fundamental controls on the timescales over which water moves through watersheds to generate streamflow. These timescales are among the greatest sources of uncertainty in forecasting future streamflow and flood events. “Setting a data workshop requires designing demonstrative actual test cases, providing the necessary data, and developing the supporting codes and notebooks,” said Edom Moges, a postdoctoral scholar working with Laurel Larsen, who helped organize this workshop. The workshop organizers compiled a database relevant to understanding how watersheds respond to precipitation and climatic factors, so that participants could produce better forecasts of streamflow. Most of the data processing and analysis tools were developed as Jupyter notebooks and will be openly accessible to the public.

Zexuan Xu during a working session.
Photo credit: Liang Zhang.

The workshop participants formed teams to address specific problems and challenges, mainly focused on data from two watersheds: HJ Andrews Experimental Forest, Oregon, and East River, Colorado, supported by Oregon State University and LBNL’s Watershed Function Scientific Focus Area, respectively. Both watersheds are currently a focus for the development of predictive models and have extensive data records from sensor networks and isotope studies. The teams organized around the following topics:

  • Transfer Entropy: This team applied transfer entropy analysis to the East River catchment to understand causal relationships between discharge (i.e., streamflow) and predictor variables (e.g., temperature, precipitation). Dominant processes controlling streamflow were compared between wet and dry years.
  • Climate Change: This team developed machine learning models to predict streamflow from historical data, tested model performance, and studied the impact of climate change using climate model projection data.
  • Timescales: Working with the HJ Andrews dataset, this group investigated dominant timescales over which information is transferred from meteorological variables to streamflow in sub-catchments and at larger spatial scales.
  • Isotope Analysis: This team estimated the fraction of “new” water found in streamflow pulses at the HJ Andrews watershed, as well as the travel-time distributions of water.
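
For readers unfamiliar with the transfer entropy and timescale topics above, the following is a minimal sketch of the general idea, not the workshop's toolkit. It assumes a simple histogram (binned) estimator with one step of target history, and it runs on synthetic data in which a made-up "discharge" series responds to a made-up "precipitation" series three time steps later. The function names (`transfer_entropy`, `_entropy`, `_discretize`), the bin count, and the data are all illustrative assumptions.

```python
import numpy as np


def _entropy(*columns):
    """Shannon entropy (bits) of the joint distribution of integer-coded columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))


def _discretize(series, n_bins):
    """Map a continuous series to integer bin labels 0..n_bins-1 (equal-width bins)."""
    edges = np.histogram_bin_edges(series, bins=n_bins)
    return np.digitize(series, edges[1:-1])


def transfer_entropy(source, target, lag=1, n_bins=4):
    """Binned estimate of transfer entropy (bits) from source to target.

    Measures how much source[t - lag] reduces uncertainty about target[t]
    beyond what target[t - 1] already explains.
    """
    x = _discretize(np.asarray(source, dtype=float), n_bins)
    y = _discretize(np.asarray(target, dtype=float), n_bins)
    if lag < 1 or lag >= len(y):
        raise ValueError("lag must be between 1 and len(target) - 1")
    y_now = y[lag:]        # target at time t
    y_past = y[lag - 1:-1]  # target at time t - 1
    x_past = x[:-lag]      # source at time t - lag
    # TE = H(y_now | y_past) - H(y_now | y_past, x_past)
    return (_entropy(y_now, y_past) - _entropy(y_past)
            - _entropy(y_now, y_past, x_past) + _entropy(y_past, x_past))


# Synthetic illustration only: "discharge" echoes "precipitation" 3 steps later.
rng = np.random.default_rng(0)
n = 5000
precip = rng.random(n)
flow = 0.2 * rng.random(n)      # unrelated noise
flow[3:] += 0.8 * precip[:-3]   # delayed response to precipitation

te_by_lag = {lag: transfer_entropy(precip, flow, lag) for lag in range(1, 7)}
best_lag = max(te_by_lag, key=te_by_lag.get)
te_reverse = transfer_entropy(flow, precip, lag=3)

for lag, te in te_by_lag.items():
    print(f"lag {lag}: TE(precip -> flow) = {te:.3f} bits")
print("lag with the largest transfer entropy:", best_lag)
```

Scanning the lag in this way is one simple route to a "dominant timescale" of information transfer. With real hydrologic records, the choice of binning, history length, and a significance test against shuffled surrogates would all need care, and the results can be sensitive to pre-processing.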

Fernando Pérez at the podium.
Photo credit: Zexuan Xu.

On the first day of the workshop, participants were invited to present their own research and to network with one another. The participants then formed project teams based on their experience, expertise and objectives for the workshop. “I found my group members by finding common interests we wrote on the cards and getting inspirations from effective communications,” said Esther Xu, a PhD student at Johns Hopkins University studying geography and environmental engineering. “We performed ensemble unit hydrograph analysis, and also learned a lot from processing and handling the real-world data.” Jim Kirchner, a professor at ETH Zurich, Switzerland, and a professor emeritus in the Dept. of Earth and Planetary Science, UC Berkeley, was invited to present his latest research on timescales of transport and hydrological response. He also joined participants in the hackathon and offered insights on many of the projects. On Friday morning, Fernando Pérez gave an informal overview of the Jupyter family and the development of geoscience-related programs, including Binder, Pangeo, and Awesome Open Geoscience, which are all widely applied in the earth science community and could greatly benefit the participants’ future research.

Participants reviewed the workshop favorably. “What a great opportunity to use widely-used machine learning and information theory techniques specifically focused on problems in hydrology,” said Kealie Pretzlav, a hydrologist from Balance Hydrologics, Inc., a hydrology consulting firm located in Berkeley. “The hackathon and training format was integral in coming away from the workshop feeling like I could not only use some of these techniques, but apply them to a wider range of applications.”

BIDS Senior Fellow and project lead Laurel Larsen and BIDS-LBNL 
Data Science Fellow Zexuan Xu with workshop participants at BIDS. 
Photo credit: Laurel Larsen.

“By working with an actual dataset, participants came to a solid appreciation of how pre-processing considerations common for earth-science datasets can impact outcomes, and provided valuable feedback on the emerging toolkit that we are developing,” Laurel Larsen commented after the participants presented their findings at the end of the workshop. With the insights gained from working with actual datasets, participants will move forward and promote the tools and ideas circulated during the workshop to the broader scientific community. Several promising collaborations among participants and organizers appear likely to result from the three-day workshop. The datasets and tools developed at the workshop are intended to become a key component in improving hydrologic forecasting through synthesis of critical storage components and timescales across watersheds worldwide.

Photos courtesy of Zexuan Xu, Laurel Larsen and Liang Zhang.

Featured Fellows

Laurel Larsen

Geography, UC Berkeley
Faculty Affiliate

Zexuan Xu

Hydrology, CERC-WET, Berkeley Lab

Fernando Pérez

Statistics, UC Berkeley; Data Science and Technology, LBNL
Faculty Affiliate