Abstract: Astronomical surveys of celestial sources produce streams of noisy time series measuring flux versus time (‘light curves’). Unlike in many other physical domains, however, large (and source-specific) temporal gaps in data arise naturally due to intranight cadence choices as well as diurnal and seasonal constraints. With nightly observations of millions of variable stars and transients from upcoming surveys, efficient and accurate discovery and classification techniques on noisy, irregularly sampled data must be employed with minimal human-in-the-loop involvement. Machine learning for inference tasks on such data traditionally requires the laborious hand-coding of domain-specific numerical summaries of raw data (‘features’). Here, we present a novel unsupervised autoencoding recurrent neural network that makes explicit use of sampling times and known heteroskedastic noise properties. When trained on optical variable star catalogues, this network produces supervised classification models that rival other best-in-class approaches. We find that autoencoded features learned in one time-domain survey perform nearly as well when applied to another survey. These networks can continue to learn from new unlabelled observations and may be used in other unsupervised tasks, such as forecasting and anomaly detection.
A recurrent neural network for classification of unevenly sampled variable stars
November 27, 2017 | Nature Astronomy | Brett Naul, Joshua S. Bloom, Fernando Pérez and Stéfan van der Walt