Interpretable Unsupervised Learning for Molecular Dynamics

Berkeley Statistics and Machine Learning Forum

December 16, 2019
1:30pm to 2:30pm
190 Doe Library

Talk Title: A family of algorithms for interpreting manifold embedding coordinates in molecular dynamics data

Abstract: Unsupervised learning methods are commonly used in the analysis of molecular dynamics data to discover the so-called molecular manifold corresponding to low-energy paths between molecular configurations. However, understanding the role of covariates such as bond rotation in determining the energy landscape is made difficult by non-trivial data topology and geometry, and interpreting these dynamics in the latent-state representation is often done visually. I will present a flexible family of parametric dictionary-based methods that replace or augment existing unsupervised learning methods by providing approximations that are interpretable with respect to a predefined dictionary of functional covariates. Our algorithms make use of a novel application of group lasso to sparsely align interpretable and latent space gradients with respect to the data manifold. We demonstrate the effectiveness of these methods on quantum-chemical simulations of small molecules.
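
To give a rough sense of what "sparsely aligning gradients with group lasso" could look like, here is a minimal sketch. It is not the speaker's implementation. It assumes that at each of n sample points we already have the gradients of p dictionary functions (array X, shape n x d x p) and of m embedding coordinates (array Y, shape n x d x m), both expressed in a d-dimensional tangent-space basis. It then solves a group-lasso regression by proximal gradient descent, where all coefficients belonging to one dictionary function form one group. The function names, array layout, and solver choice are all assumptions made for illustration.

```python
import numpy as np


def group_soft_threshold(B, thresh):
    """Shrink each dictionary-function group B[:, j, :] toward zero.

    B has shape (n, p, m). A group whose Frobenius norm is below
    `thresh` is set exactly to zero; larger groups are scaled down.
    """
    norms = np.sqrt((B ** 2).sum(axis=(0, 2), keepdims=True))  # (1, p, 1)
    scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
    return B * scale


def objective(X, Y, B, lam):
    """0.5 * sum_i ||Y_i - X_i B_i||_F^2 + lam * sum_j ||B[:, j, :]||_F."""
    resid = np.einsum("ndp,npm->ndm", X, B) - Y
    group_norms = np.sqrt((B ** 2).sum(axis=(0, 2)))
    return 0.5 * (resid ** 2).sum() + lam * group_norms.sum()


def group_lasso_gradients(X, Y, lam, n_iter=500):
    """Proximal gradient descent for the group-lasso problem above.

    X: (n, d, p) gradients of the dictionary functions (assumed given).
    Y: (n, d, m) gradients of the embedding coordinates (assumed given).
    Returns B of shape (n, p, m); dictionary functions whose group is
    entirely zero are treated as not selected.
    """
    n, d, p = X.shape
    m = Y.shape[2]
    B = np.zeros((n, p, m))
    # The smooth part is block-diagonal over points, so its Lipschitz
    # constant is the largest squared spectral norm among the X_i.
    lipschitz = max(np.linalg.norm(X[i], 2) ** 2 for i in range(n))
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        resid = np.einsum("ndp,npm->ndm", X, B) - Y
        grad = np.einsum("ndp,ndm->npm", X, resid)
        B = group_soft_threshold(B - step * grad, step * lam)
    return B
```

In a sketch like this, the regularization strength lam would typically be swept from large to small, and the dictionary functions whose groups become nonzero first are read as candidate interpretable coordinates. How the gradients and tangent spaces are estimated is outside the scope of this example.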

Full details about this meeting will be posted here: https://bids.github.io/MLStatsForum/.

The Berkeley Statistics and Machine Learning Forum meets biweekly to discuss current applications across a wide variety of research domains and software methodologies. Hosted by UC Berkeley Physics Professor and BIDS Senior Fellow Uros Seljak, these active sessions bring together domain scientists, statisticians and computer scientists who are either developing state-of-the-art methods or are interested in applying these methods in their research. Practical questions about the meetings can be directed to BIDS Fellow Francois Lanusse. All interested members of the UC Berkeley and LBL communities are welcome and encouraged to attend. To receive email notifications about the meetings and upvote papers for discussion, please register here.

Speaker(s)

Samson Koelle

IGERT Graduate Student, eScience Institute, University of Washington

Sam studied math and biology at Columbia University. After two years of stem-cell biology research at the National Institutes of Health, which illustrated an urgent need for improved statistical tools, he returned to his hometown of Seattle to study statistics. He now works mainly on the theoretical underpinnings of dimension reduction, with applications in chemistry and genomics. He has also interned with Amazon’s forecasting science team, and was a Chateaubriand Fellow at the University of Bordeaux.