Inselect: Automating the Digitization of Natural History Collections

Lawrence N. Hudson, Vladimir Blagoderov, Alice Heaton, Pieter Holtzhausen, Laurence Livermore, Benjamin W. Price, Stéfan van der Walt, Vincent S. Smith


November 23, 2015

The world’s natural history collections constitute an enormous evidence base for scientific research on the natural world. To facilitate these studies and improve access to collections, many organisations are embarking on major programmes of digitization. Processing the tens of millions of specimens common to most natural history collections requires automated approaches to mass-digitization that support rapid imaging of specimens and associated data capture. In this paper the authors present Inselect, a modular, easy-to-use, cross-platform suite of open-source software tools that supports the semi-automated processing of specimen images generated by natural history digitization programmes. The software consists of a desktop application for Windows, Mac OS X, and Linux, together with command-line tools designed for unattended operation on batches of images. By combining image visualisation algorithms that automatically recognise specimens with workflows that support post-processing tasks such as barcode reading, label transcription and metadata capture, Inselect fills a critical gap and helps to increase the rate of specimen digitization.
