Evaluating Sensitivity to the Stick Breaking Prior in Bayesian Nonparametrics

Runjing Liu, Ryan Giordano, Michael I. Jordan, Tamara Broderick

arXiv.org
October 15, 2018

A central question in many probabilistic clustering problems is how many distinct clusters are present in a particular dataset. A Bayesian nonparametric (BNP) model addresses this question by placing a generative process on cluster assignment. However, like all Bayesian approaches, BNP requires the specification of a prior. In practice, it is important to quantitatively establish that the prior is not too informative, particularly when the particular form of the prior is chosen for mathematical convenience rather than because of a considered subjective belief. We derive local sensitivity measures for a truncated variational Bayes (VB) approximation and approximate nonlinear dependence of a VB optimum on prior parameters using a local Taylor series approximation. Using a stick-breaking representation of a Dirichlet process, we consider perturbations both to the scalar concentration parameter and to the functional form of the stick- breaking distribution. Unlike previous work on local Bayesian sensitivity for BNP, we pay special attention to the ability of our sensitivity measures to extrapolate to different priors, rather than treating the sensitivity as a measure of robustness per se. Extrapolation motivates the use of multiplicative perturbations to the functional form of the prior for VB. Additionally, we linearly approximate only the computationally intensive part of inference -- the optimization of the global parameters -- and retain the nonlinearity of easily computed quantities as functions of the global parameters. We apply our methods to estimate sensitivity of the expected number of distinct clusters present in the Iris dataset to the BNP prior specification. We evaluate the accuracy of our approximations by comparing to the much more expensive process of re-fitting the model.

Presented at NIPS 2018 BNP Workshop on December 7, 2018.

Statistics