EECS Data Science Course



December 11, 2014
3:30pm to 4:30pm
190 Doe Library

Topic: Poster Session from Data Science Students (CS194-16) in the Department of Electrical Engineering and Computer Sciences 

How the Brain Responds to Images (5 teams)
Data originally from Jack Gallant's group (fMRI recordings of people looking at particular images), curated by Karl Zipser from the Redwood Center (theoretical neuroscience). The goal is to understand and explore the mapping from images to various parts of the brain, sometimes using hierarchical models of vision from Berkeley's Caffe toolkit.

Supernovae Classification from Telescope Images (4 teams)
Data from Peter Nugent's group in Astronomy: massive numbers of small image sections from telescopes. The goal is to correctly recognize ephemeral objects like supernovae.

Classification of Museum Specimens (3 teams) 
The Berkeley Museums have millions of digitized but unclassified specimen images from collections in Zoology, Anthropology, Biology, etc. Many of these include hand-written or typed labels, but the data they contain is disorganized, partial, and sometimes occluded. The goal of these projects was to try to improve automated recognition of images ab initio, or to reconcile crowd-sourced annotations of them.

City of San Francisco Data Challenges (5 teams)
The City of San Francisco has perhaps the most comprehensive and best-organized public data sources of any major US city. The goal of these projects was to explore some of the opportunities this data offers, e.g., using public reviews (Yelp, etc.) to predict health inspections, or using 311 reports to predict crimes.

Patent Author Unification (2 teams)
The US patent databases have poorly resolved author identification. The goal of this project was to improve author resolution, to support the development of better patent searching tools.

California Earthquake Prediction (1 team)
Still a problem everyone cares about, and one that still doesn't have a non-trivial solution. The goal was to match and hopefully improve on baseline models.

BIDS Tea is hosted on Mondays and Thursdays from 3:30 to 4:30 p.m. At each event, an invited person or group will spend 5–10 minutes discussing a data science project or problem, followed by time for discussion and networking.