BIDS Machine Learning and Science Forum — Understanding overparameterized neural networks

Date: Monday, March 29, 2021 (Rescheduled from March 22)
Time: 11:00 AM - 12:00 PM Pacific Time
Location: Virtual participation via Zoom

Understanding overparameterized neural networks

Speaker: Jascha Sohl-Dickstein, Senior Staff Research Scientist, Google Brain
Abstract: As neural networks become wider, their accuracy improves and their behavior becomes easier to analyze theoretically. I will give an introduction to a rapidly growing field -- closely connected to statistical physics -- which examines the learning dynamics of, and the prior over functions induced by, infinitely wide, randomly initialized neural networks. Core results I will discuss include: that the distribution over functions computed by a wide neural network often corresponds to a Gaussian process with a particular compositional kernel, both before and after training; that the predictions of wide neural networks are linear in their parameters throughout training; and that this perspective enables analytic predictions for how the trainability of finite-width networks depends on hyperparameters and architecture. These results enable surprising capabilities -- for instance, computing the test set predictions of an infinitely wide trained neural network without ever instantiating a network, or rapidly training convolutional networks with 10,000+ layers. I will argue that this growing understanding of neural networks in the limit of infinite width is foundational for future theoretical and practical understanding of deep learning. Neural Tangents (a software library for working with infinite-width networks): https://github.com/google/neural-tangents.
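
To make that capability concrete, the sketch below uses the Neural Tangents library linked above to compute the test set predictions of an infinitely wide, fully trained network in closed form, without instantiating a finite network. It is a minimal illustration, not material from the talk: the three-layer ReLU architecture and the random toy data are arbitrary choices, and the calls follow the library's documented stax and nt.predict interfaces.

    import jax.random as random
    import neural_tangents as nt
    from neural_tangents import stax

    # An "infinitely wide" 3-layer ReLU network. kernel_fn evaluates the
    # corresponding NNGP and NTK kernels analytically; the width argument
    # to Dense only matters for the finite-width init_fn/apply_fn.
    init_fn, apply_fn, kernel_fn = stax.serial(
        stax.Dense(512), stax.Relu(),
        stax.Dense(512), stax.Relu(),
        stax.Dense(1),
    )

    # Toy regression data (arbitrary, for illustration only).
    key1, key2, key3 = random.split(random.PRNGKey(0), 3)
    x_train = random.normal(key1, (20, 10))
    y_train = random.normal(key2, (20, 1))
    x_test = random.normal(key3, (5, 10))

    # Closed-form predictions of the infinite-width network trained to
    # convergence by gradient descent on mean squared error.
    predict_fn = nt.predict.gradient_descent_mse_ensemble(
        kernel_fn, x_train, y_train)

    # 'nngp' is the Bayesian Gaussian-process posterior mean; 'ntk' is the
    # mean prediction of an ensemble of gradient-descent-trained networks
    # (the linear-in-parameters regime described in the abstract).
    y_nngp, y_ntk = predict_fn(x_test=x_test, get=("nngp", "ntk"))
    print(y_nngp.shape, y_ntk.shape)  # (5, 1), (5, 1)

Because the kernels are evaluated analytically, the same pattern extends to architectures (such as very deep convolutional networks) whose finite-width counterparts would be expensive to train directly.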

The BIDS Machine Learning and Science Forum meets biweekly to discuss current applications of machine learning across a wide variety of research domains in the physical sciences and beyond. These active sessions bring together domain scientists, statisticians, and computer scientists who are either developing state-of-the-art methods or are interested in applying these methods in their research. This Forum is organized by BIDS Faculty Affiliate Uroš Seljak (Professor of Physics at UC Berkeley), BIDS Research Affiliate Ben Nachman (physicist at Lawrence Berkeley National Laboratory), Vanessa Böhm, and Ben Erichson. All interested members of the UC Berkeley and Berkeley Lab communities are welcome and encouraged to attend. To receive email notifications about upcoming meetings, or to request more information, please contact berkeleymlforum@gmail.com.

Speaker(s)

Jascha Sohl-Dickstein

Senior Staff Research Scientist, Google Brain

Jascha Sohl-Dickstein is a senior staff research scientist at Google Brain, where he leads a research team with interests spanning machine learning, physics, and neuroscience. He was previously a visiting scholar in Surya Ganguli's lab at Stanford and an academic resident at Khan Academy. He earned his PhD in 2012 in Bruno Olshausen's lab at the Redwood Center for Theoretical Neuroscience at UC Berkeley. Prior to his PhD, he spent several years working for NASA on the Mars Exploration Rover mission.