NASA Earth Exchange (NEX): Big Data Challenges, High-Performance Computing, and Machine Learning Innovations

Data Science Lecture Series

Lecture

September 25, 2015
1:00pm to 2:30pm
190 Doe Library

NASA Earth Exchange (NEX) provides a unique collaborative platform for scientists and researchers around the world to do research in a scientifically complex area. NEX provides customized open source tools, scientific workflows, access to petabytes of satellite and climate data, models, and computing power. Over the past three years, NEX has evolved to handle projects that involve data complexity, model integration, and high-performance computing.

Another unique aspect of NEX is its collaboration with Amazon Web Services (AWS) to create the OpenNEX platform, which leverages the full stack of AWS’s cloud computing platform to demonstrate scientifically relevant projects for government agencies, commercial companies, and other stakeholders. OpenNEX provides access to a wide variety of data through AWS’s public datasets program, along with virtual machines that replicate a given workflow, capturing data access, search, analysis, computation, and visualization. OpenNEX collaborated with Berkeley’s Geospatial Innovation Facility (GIF) to create an open source dashboard for visualizing the downscaled climate projections dataset.

A pressing need in both initiatives is to handle large image datasets and analyze them efficiently using high-performance and cloud computing infrastructures. With funding from several NASA program elements (e.g., AIST, ACCESS, CMS), NEX has showcased activities in which new machine learning algorithms can be deployed and scaled across these computing architectures to process very high-resolution imagery datasets for object classification, segmentation, and feature extraction. One example is processing a quarter million image scenes from the 1-m multispectral NAIP dataset to estimate tree cover for the continental United States, given the considerable complexity and heterogeneity of land cover types. New computational techniques using open source tools and cloud architectures are essential to achieving performance efficiency in some heritage scientific research domains and analyses.

Speaker(s)

Sangram Ganguly

Senior Research Scientist, NASA

I am a research scientist at the Biosphere Science Branch at NASA Ames Research Center, Moffett Field, California, and at the Bay Area Environmental Research Institute. 

My work leverages expertise across a range of disciplines, including cloud computing solutions for big data science and analytics, machine learning, advanced satellite remote sensing and image analytics, and climate sciences. 

I earned my PhD at Boston University (USA). Prior to that, I graduated with an integrated master's (BS and MS) degree in geosciences from the Indian Institute of Technology (IIT), Kharagpur, India, in 2004. I am an active panelist for the NSF and NASA carbon and ecosystem programs and a science team member for the NASA Carbon Monitoring System Program. My research has been highlighted in mainstream news media, and I am the recipient of five NASA achievement awards recognizing work in ecosystem forecasting, climate science, and remote sensing. I am also a cofounding member of the NASA Earth Exchange Collaborative and Supercomputing Facility at NASA Ames and a founding member and developer of the OpenNEX Platform.

Specialties 

Advanced remote sensing techniques and physical algorithms: radiative transfer theory, MODIS and MISR, Landsat-derived biophysical variables (e.g., LAI), lidar and radar remote sensing for biomass estimation, multi-sensor fusion for carbon flux estimation

Image analytics: pattern recognition/image classification, big data architectures for large image manipulation and query, soft computing techniques (fuzzy, neuro-fuzzy, genetic algorithms)

Climate modeling and dynamics

Advanced signal processing techniques for multi-dimensional and multi-temporal analysis of satellite imagery

Machine learning: deep learning algorithms for large image classification (e.g., from WorldView, NAIP, Landsat, etc.), biophysical parameter prediction, and climate gridding