Presented at the Physics in Machine Learning Workshop held at BIDS on May 29, 2019.
Abstract: Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. The growing exploration of machine learning algorithms in particle physics offers new solutions for simulation, reconstruction, and analysis. These new machine learning solutions often lead to increased parallelization and faster reconstruction times on dedicated hardware, specifically Field Programmable Gate Arrays (FPGAs). We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 image classifier to demonstrate state-of-the-art performance for top jet tagging at the LHC, and we apply transfer learning to neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge) service, representing an improvement of ∼30× (175×) in model inference latency over traditional CPU inference on current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to the throughput achieved using a GPU with a large batch size. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
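The latency and speedup figures quoted in the abstract can be cross-checked against each other. The sketch below is only a back-of-the-envelope consistency check using the numbers stated in the text (60 ms cloud / 10 ms edge latencies, ∼30×/175× speedups, 600-700 inferences per second); no measured values are introduced.

```python
# Consistency check of the latency figures quoted in the abstract.
# All inputs are the numbers stated in the text; nothing here is measured.

cloud_ms = 60        # Brainwave accessed as a cloud service
edge_ms = 10         # Brainwave accessed as an edge service
cloud_speedup = 30   # quoted improvement over CPU inference (cloud)
edge_speedup = 175   # quoted improvement over CPU inference (edge)

# Implied CPU-only inference time from each comparison (in seconds):
cpu_from_cloud = cloud_ms * cloud_speedup / 1000  # 60 ms * 30  = 1.8 s
cpu_from_edge = edge_ms * edge_speedup / 1000     # 10 ms * 175 = 1.75 s

print(f"implied CPU latency (cloud comparison): {cpu_from_cloud:.2f} s")
print(f"implied CPU latency (edge comparison):  {cpu_from_edge:.2f} s")

# The quoted 600-700 inferences/s from a single FPGA serving batch-of-one
# requests implies roughly 1.4-1.7 ms of accelerator time per image, so
# most of the 10-60 ms seen by any one client is transport overhead, which
# is why many CPUs can share one FPGA service.
per_image_ms = (1000 / 700, 1000 / 600)
print(f"per-image accelerator time implied by throughput: "
      f"{per_image_ms[0]:.1f}-{per_image_ms[1]:.1f} ms")
```

The two implied CPU latencies (1.8 s and 1.75 s) agree to within a few percent, which is why the cloud and edge speedup factors quoted in the abstract are mutually consistent.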