Statistical Methods and Software for the Study of Olfactory Stem Cell Differentiation Using Single-Cell Transcriptome Sequencing

Berkeley Distinguished Lectures in Data Science

Single-cell transcriptome sequencing (scRNA-Seq), which combines high-throughput single-cell extraction and sequencing capabilities, enables the transcriptomes of large numbers of individual cells to be assayed efficiently. Profiling of gene expression at the single-cell level for a large sample of cells is crucial for addressing many biologically relevant questions, such as the investigation of rare cell types or primary cells (e.g., stem cell differentiation) and the examination of subpopulations of cells from a larger heterogeneous population (e.g., classifying cells in brain tissues).

Dr. Dudoit will discuss some of the statistical and computational issues that have arisen in the context of a collaboration with the UC Berkeley Ngai Lab concerning the analysis of olfactory stem cell fate trajectories in mice. These issues, ranging from so-called low-level to high-level analysis, include: experimental design, exploratory data analysis (EDA) of scRNA-Seq reads, quality assessment/control (QA/QC), normalization to account for nuisance technical effects, cluster analysis to identify novel cell types, cell lineage and pseudotime inference, and differential expression analysis to identify genes involved in the differentiation process. This project's statistical methods are implemented in open-source R packages released through the Bioconductor Project.

The Berkeley Distinguished Lectures in Data Science, co-hosted by the Berkeley Institute for Data Science (BIDS) and the Berkeley Division of Data Sciences, features faculty doing visionary research that illustrates the character of the ongoing data, computational, and inferential revolution. In this inaugural Fall 2017 "local edition," we bring forward Berkeley faculty working in these areas as part of enriching the active connections among colleagues campus-wide. All campus community members are welcome and encouraged to attend. Arrive at 3:30pm for tea, coffee, and discussion.


Sandrine Dudoit

Professor, Division of Biostatistics and Department of Statistics

Sandrine Dudoit is professor of biostatistics and statistics and chair of the graduate group in biostatistics at the University of California, Berkeley. Professor Dudoit's methodological research interests concern high-dimensional inference and include exploratory data analysis, visualization, loss-based estimation with cross-validation (e.g., density estimation, regression, model selection), and multiple hypothesis testing. Much of her methodological work is motivated by statistical inference questions arising in biological research and, in particular, the design and analysis of high-throughput microarray and sequencing gene expression experiments, for example, mRNA-Seq for transcriptome analysis and genome annotation and ChIP-Seq for DNA-protein interaction profiling (e.g., transcription factor binding). Her contributions include exploratory data analysis, normalization and expression quantitation, differential expression analysis, class discovery, prediction, and integration of biological annotation metadata (e.g., gene ontology annotation). She is also interested in statistical computing and, in particular, reproducible research. She is a founding core developer of the Bioconductor Project, an open-source and open-development software project for the analysis of biomedical and genomic data.

Professor Dudoit is a coauthor of the book Multiple Testing Procedures with Applications to Genomics and a coeditor of the book Bioinformatics and Computational Biology Solutions Using R and Bioconductor. She is associate editor of three journals, including The Annals of Applied Statistics and IEEE/ACM Transactions on Computational Biology and Bioinformatics. Professor Dudoit was named fellow of the American Statistical Association in 2010 and elected member of the International Statistical Institute in 2014.