ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia

Aaron Halfaker, R. Stuart Geiger

arXiv.org
September 11, 2019

This paper presents an overview and case studies of ORES, Wikipedia’s real-time machine-learning-as-a-service platform, which was designed in line with Wikipedian values of open participation, decentralization, and continual iteration. Instead of trying to build the single best classifier for Wikipedia’s quality control and content moderation purposes, ORES facilitates multiple independent community-commissioned classifiers, built using different training data, ML techniques, and parameter sets. Over 100 classifiers have been built with ORES, and all of them operate in real time via an open API. ORES decouples and reduces incidental complexity around several aspects of applying machine learning in a user-generated content platform, including curating training data sets, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions.

Abstract: Algorithmic systems -- from rule-based bots to machine learning classifiers -- have a long history of supporting the essential work of content moderation and other curation work in peer production projects. From counter-vandalism to task routing, basic machine prediction has allowed open knowledge projects like Wikipedia to scale to the largest encyclopedia in the world, while maintaining quality and consistency. However, conversations about how quality control should work and what role algorithms should play have generally been led by the expert engineers who have the skills and resources to develop and modify these complex algorithmic systems. In this paper, we describe ORES: an algorithmic scoring service that supports real-time scoring of wiki edits using multiple independent classifiers trained on different datasets. ORES decouples several activities that have typically all been performed by engineers: choosing or curating training data, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions. This meta-algorithmic system was designed to open up socio-technical conversations about algorithmic systems in Wikipedia to a broader set of participants. In this paper, we discuss the theoretical mechanisms of social change ORES enables and detail case studies in participatory machine learning around ORES from the 4 years since its deployment.
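To make the decoupling concrete, here is a minimal Python sketch of how a tool developer might consume scores without touching training data or model code. It is an illustration under stated assumptions, not the authors' client: the base URL, the `models`/`revids` query parameters, and the nested response layout follow the ORES v3 API as we understand it and should be verified against the live service. The helper names `build_score_url` and `extract_probabilities` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for the ORES v3 scoring endpoint; verify before use.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_score_url(wiki, models, rev_ids):
    """Build one scoring URL covering several models and revisions."""
    query = urlencode({
        "models": "|".join(models),                 # e.g. "damaging|goodfaith"
        "revids": "|".join(str(r) for r in rev_ids),
    })
    return f"{ORES_BASE}/{wiki}/?{query}"


def extract_probabilities(payload, wiki, model):
    """Map revision ID -> probability of the positive ("true") class.

    Assumes a response shaped like
    {wiki: {"scores": {rev_id: {model: {"score": {...}}}}}}.
    Revisions that carry an error instead of a score are skipped.
    """
    results = {}
    scores = payload.get(wiki, {}).get("scores", {})
    for rev_id, by_model in scores.items():
        score = by_model.get(model, {}).get("score")
        if score is not None:
            results[int(rev_id)] = score["probability"]["true"]
    return results
```

A patrolling tool could fetch the URL with any HTTP client, pass the decoded JSON to `extract_probabilities`, and apply its own threshold. Acting on predictions stays separate from building the models that produce them, which is the separation the abstract describes.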


