Supporting the Data Science Lifecycle

Berkeley Distinguished Lectures in Data Science

Like other sciences, Computer Science is finding itself quickly changed by the ubiquity of data. In this talk, I will touch briefly on my view of data-centric work within Computer Science, and on the gaps that have opened in computing research as a result. Then I will describe a sequence of projects in my group supporting data science lifecycles, which should be of interest more broadly across data-centric fields. These include the Data Wrangler project (commercialized as Trifacta), and two new projects in the RISELab: Jarvis (an experiment management framework) and Ground (a data context system). These projects are targeted at “everyday data professionals” in both business and academia. This talk represents joint work with a host of collaborators, including Joey Gonzalez, Rolando Garcia and Vikram Sreekanti, and it should be accessible to anyone who works with data.

The Berkeley Distinguished Lectures in Data Science, co-hosted by the Berkeley Institute for Data Science (BIDS) and the Berkeley Division of Data Sciences, feature faculty doing visionary research that illustrates the character of the ongoing data, computational, and inferential revolution. All campus community members are welcome and encouraged to attend. Arrive at 3:30pm for tea, coffee and discussion prior to the formal presentation.

Speaker(s)

Joe Hellerstein

Professor, Electrical Engineering and Computer Sciences
University of California, Berkeley

Joseph M. Hellerstein is the Jim Gray Professor of Computer Science at the University of California, Berkeley. His work focuses on data-centric systems and the way they drive computing, including parallel and distributed programming models, analytic data context, interactive data visualization and transformation, scalable machine learning, distributed systems and networking. He collaborates actively with colleagues in a wide variety of fields including Programming Languages, Human-Computer Interaction, Machine Learning, Networking, Security, and Theoretical Computer Science. He is an ACM Fellow, an Alfred P. Sloan Research Fellow and the recipient of three ACM-SIGMOD "Test of Time" awards for his research. In 2010, Fortune Magazine included him in their list of the 50 smartest people in technology, and MIT's Technology Review magazine included his work on their TR10 list of the 10 technologies "most likely to change our world". Hellerstein is the co-founder and Chief Strategy Officer of Trifacta, a software vendor providing intelligent interactive solutions for wrangling data. He serves on the technical advisory boards of a number of computing and internet companies including EMC, SurveyMonkey, Captricity and Dato, and previously served as the Director of Intel Research, Berkeley.