BIDS Senior Fellow Laurel Larsen and ESDL Project Scientist Dino Bellugi offer this project (#4) through UC Berkeley's Undergraduate Research Apprentice Program (URAP).
The Environmental Systems Dynamics Laboratory (ESDL) focuses on the interplay between biological, physical, and human aspects of the environment using a combination of physically-based and data-driven models. Research topics include how river deltas grow or shrink, how landslides occur and mobilize, how deforestation affects precipitation, and how to forecast the response of environmental systems under changing forcing scenarios. This internship aims to expand on our current work exploring the use of deep learning (DL) for environmental predictions.
DL methods often outperform other models (including physical ones) in making environmental predictions, but they are typically used as a “black box”, which reduces our ability to gain insight into the physical processes involved. For example, Long Short-Term Memory (LSTM) networks are extremely effective at predicting river streamflow, even in snow-dominated watersheds, because they can capture the lags between the forcing and response variables. Unlike a physical model, the LSTM does not know that winter precipitation falls as snow and does not become streamflow until the melting season. Yet it learns from data that the system has a memory, and in many cases it generates accurate streamflow predictions from precipitation and temperature time series alone. In such cases the LSTM's internal state variables track observed snow measurements, even though these measurements were never provided to the LSTM as inputs. This suggests that the internal states of a trained LSTM represent hydrologic processes that control streamflow, and that these processes can be identified by their correspondence to independent, collocated observational datasets that the model has not seen. Thus, analyzing the LSTM state variables could provide insight into how the response may change under different climatic regimes, and could make it possible to approximate basin-wide variables that are not measured in many watersheds.
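One simple way to look for such a correspondence is to correlate each state unit's time series with an independent observation. The sketch below is only an illustration in plain Python: the function names (`pearson`, `rank_states`), the toy snow series, and the two synthetic state units are all made up for the example, and in practice the states would be extracted from a trained LSTM.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information
    return cov / math.sqrt(var_x * var_y)

def rank_states(cell_states, observed):
    """Rank state units by |correlation| with an observed series.

    cell_states: list of time steps, each a list with one value per unit.
    observed:    list with one observation per time step.
    Returns (unit_index, correlation) pairs, strongest first.
    """
    n_units = len(cell_states[0])
    scores = []
    for j in range(n_units):
        series = [step[j] for step in cell_states]
        scores.append((j, pearson(series, observed)))
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)

# Hypothetical data: a snow record and two synthetic state units.
# Unit 0 is a linear function of snow; unit 1 just alternates in sign.
snow = [0.0, 1.0, 3.0, 4.0, 2.0, 0.5]
states = [[2.0 * s + 1.0, (-1.0) ** t] for t, s in enumerate(snow)]

ranking = rank_states(states, snow)
best_unit, best_corr = ranking[0]
```

Here `best_unit` is the state unit most strongly associated with the snow record. Linear correlation is only one possible probe; a state could encode snow nonlinearly or spread it across several units, which this check would miss.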
In addition, we seek to introduce physical constraints (such as water balance) to the LSTM by modifying the optimization loss function and/or by including process-based model outputs among the input variables. We expect this to improve streamflow prediction, particularly in non-stationary conditions where out-of-sample data are more frequent, and to support more robust generalization to other watersheds where measurements are sparser.
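As a rough illustration of what modifying the loss function could look like, the sketch below adds a soft water-balance penalty to a mean squared error. It is an assumed form, not the project's actual loss: the penalty only discourages predicted streamflow from exceeding precipitation over the training window, it ignores evapotranspiration and storage change, and it assumes both series are in the same units (for example, mm per day over the basin). The names `water_balance_penalty`, `constrained_loss`, and `weight` are hypothetical.

```python
def mse(pred, obs):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

def water_balance_penalty(q_pred, precip):
    """Squared excess of mean predicted streamflow over mean precipitation.

    Returns zero whenever predicted streamflow does not exceed the water
    supplied by precipitation over the window.
    """
    excess = (sum(q_pred) - sum(precip)) / len(precip)
    return max(0.0, excess) ** 2

def constrained_loss(q_pred, q_obs, precip, weight=1.0):
    """Data-fit term plus a weighted physical-consistency term."""
    return mse(q_pred, q_obs) + weight * water_balance_penalty(q_pred, precip)

# Hypothetical four-step window.
precip = [2.0, 0.0, 1.0, 0.0]
q_obs = [0.5, 0.5, 0.5, 0.5]

# Predictions that respect the mass constraint incur no penalty.
loss_ok = constrained_loss([0.5, 0.5, 0.5, 0.5], q_obs, precip, weight=2.0)

# Predictions that create more water than fell are penalized twice:
# once for misfit and once for violating the water balance.
loss_bad = constrained_loss([1.0, 1.0, 1.0, 1.0], q_obs, precip, weight=2.0)
```

The `weight` parameter controls how strongly the constraint competes with the data-fit term. In a real training loop the same expression would be written with a differentiable tensor library so that gradients flow through the penalty.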
Similar applications include the prediction of soil moisture, evapotranspiration, solute concentration, and subsurface pore water pressure. In addition to generating good predictions, we would like to learn how the response times of these systems change across time and space. We also want to explore how transferable DL methods are across different landscapes or climate gradients, as transferability is essential in developing larger-scale models that can be trained concurrently on many different watersheds in different climatic and topographic settings. Finally, we want to explore how introducing physical constraints, using physics-based loss functions and hybrid data-driven and process-based models, can aid generalization and performance in non-stationary conditions.
Participating URAP interns will work with a variety of time series data from intensely monitored Critical Zone observatories, as well as from state and national datasets of discharge and precipitation. Interns will work collaboratively to develop DL models and to interpret the LSTM state variables and their relative importance.