NIMBLE: Programming Statistical Algorithms for Graphical (Hierarchical) Models

January 25, 2016

The NIMBLE project is a collaborative effort among the departments of Statistics; Computer Science; and Environmental Science, Policy and Management at UC Berkeley. NIMBLE stands for Numerical Inference for statistical Models using Bayesian and Likelihood Estimation. The key idea behind NIMBLE is to combine flexible hierarchical model specification with a system for programming statistical algorithms that can adapt to model structures. NIMBLE makes it possible to implement, distribute, and programmatically control and modify statistical algorithms, which can be applied to any stochastic model defined as a directed acyclic graph. The NIMBLE system provides a flexible language for declaring a wide range of hierarchical models, a framework for defining algorithms that operate on this representation of models, and a compiler for generating equivalent C++, all from within the R environment.

The design of NIMBLE combines several approaches that are novel among statistical software. NIMBLE adopts and extends the BUGS language to provide a general framework for hierarchical model specification. NIMBLE processes BUGS code to generate model objects that can be used by statistical algorithms: they can be queried for relationships among variables and used to run simulations or calculate probability densities. NIMBLE provides a domain-specific language (DSL) embedded within R for writing model-generic statistical algorithms. The DSL can be thought of as a subset of R with some special functions for interacting with model objects. Programming algorithms in NIMBLE is a lot like programming in R, but the DSL is formally a distinct language, defined by what can be compiled. Finally, NIMBLE provides a compiler that translates statistical models and algorithms into corresponding C++. Compiled objects are accessed and controlled from within R.
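To make the idea of a queryable model object concrete, here is a minimal, language-neutral sketch in Python. It is not NIMBLE code and does not use NIMBLE's API: the class ToyModel, its method names, and the example model (mu ~ Normal(0, 10), with three observations y ~ Normal(mu, 1)) are all assumptions chosen for illustration. It shows only the three capabilities described above: querying variable relationships, simulating, and calculating log probability densities.

```python
import math
import random


class ToyModel:
    """Hypothetical stand-in for a model object; not NIMBLE's actual API."""

    def __init__(self):
        # Directed acyclic graph: each node maps to its list of parents.
        # Assumed model: mu ~ Normal(0, 10); y0, y1, y2 ~ Normal(mu, 1).
        self.parents = {"mu": [], "y0": ["mu"], "y1": ["mu"], "y2": ["mu"]}
        self.values = {name: 0.0 for name in self.parents}

    def _mean_sd(self, node):
        # Distribution parameters of a node given the current parent values.
        if node == "mu":
            return 0.0, 10.0
        return self.values["mu"], 1.0

    def get_dependencies(self, node):
        """Return the node itself followed by its direct children."""
        children = [n for n, ps in self.parents.items() if node in ps]
        return [node] + children

    def simulate(self, nodes, rng):
        """Draw new values for the given nodes, in the order supplied."""
        for node in nodes:
            mean, sd = self._mean_sd(node)
            self.values[node] = rng.gauss(mean, sd)

    def calculate(self, nodes):
        """Sum the normal log densities of the given nodes."""
        total = 0.0
        for node in nodes:
            mean, sd = self._mean_sd(node)
            z = (self.values[node] - mean) / sd
            total += -0.5 * z * z - math.log(sd) - 0.5 * math.log(2 * math.pi)
        return total


model = ToyModel()
deps = model.get_dependencies("mu")      # structural query on the graph
model.simulate(deps, random.Random(1))   # mu first, then the nodes below it
log_prob = model.calculate(deps)         # joint log density of those nodes
```

Because an algorithm talks to the model only through queries like these, the same algorithm code could in principle be pointed at a different graph without modification.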

To support model-generic programming, NIMBLE lets statistical algorithms adapt themselves to different model structures. This is accomplished through two-stage programming of algorithms. The first, one-time “setup” stage is specified in R, which allows an algorithm to specialize to a model by querying the model's structure. The second, “run-time” stage is specified in the NIMBLE DSL and makes up the numerical core of the algorithm. Run-time functions can be evaluated natively in R or compiled to C++. The former allows easier debugging of algorithm logic, while the latter provides much faster execution. This design separates the logically distinct steps of model specialization and core algorithmic processing, and it draws on the concepts of specialization and staged evaluation from computer science.
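The two-stage pattern can be sketched with a closure. The following Python example is an illustration under stated assumptions, not NIMBLE's implementation: the function names, the toy model, and the choice of a random-walk Metropolis step as the example algorithm are all hypothetical. The outer function plays the role of the setup stage and inspects the graph once; the inner function plays the role of the run-time stage and does only numerical work. Nothing here is compiled, so the C++ side of the design is not represented.

```python
import math
import random

# Hypothetical toy model: mu ~ Normal(0, 10); y0, y1 ~ Normal(mu, 1).
PARENTS = {"mu": [], "y0": ["mu"], "y1": ["mu"]}


def log_density(values, node):
    """Normal log density of one node given the current values."""
    mean, sd = (0.0, 10.0) if node == "mu" else (values["mu"], 1.0)
    z = (values[node] - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2 * math.pi)


def make_rw_sampler(parents, target, scale):
    # Setup stage (runs once): query the graph to find every node whose
    # density changes when `target` changes.
    affected = [target] + [n for n, ps in parents.items() if target in ps]

    def run(values, rng):
        # Run stage (runs every iteration): one random-walk Metropolis step
        # that touches only the nodes identified during setup.
        current = values[target]
        log_p_current = sum(log_density(values, n) for n in affected)
        values[target] = current + rng.gauss(0.0, scale)
        log_p_proposed = sum(log_density(values, n) for n in affected)
        accept_prob = math.exp(min(0.0, log_p_proposed - log_p_current))
        if rng.random() < accept_prob:
            return True            # keep the proposed value
        values[target] = current   # reject: restore the previous value
        return False

    return run


values = {"mu": 0.0, "y0": 1.2, "y1": 0.8}
step = make_rw_sampler(PARENTS, "mu", scale=0.5)  # specialize once
rng = random.Random(42)
draws = []
for _ in range(2000):                             # run many times
    step(values, rng)
    draws.append(values["mu"])
```

The generic sampler never hard-codes which nodes depend on mu; that information is discovered during setup, which is what would let the same code be reused on a differently structured model.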

NIMBLE is available as an R package, although it is not currently available on CRAN. Visit our website (http://r-nimble.org/) for package downloads, join our Google group user forum (nimble-users), or email the development team directly at nimble.stats@gmail.com.