Scaling Lattice QCD beyond 100 GPUs



TIME: 2:00PM - 2:30PM

AUTHOR(S): Ronald Babich, Michael A. Clark, Bálint Joó, Guochun Shi, Richard C. Brower, Steven Gottlieb


Over the past five years, graphics processing units (GPUs) have had a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations in nuclear and particle physics. While GPUs have been applied with great success to the post-Monte Carlo "analysis" phase, which accounts for a substantial fraction of the workload in a typical LQCD calculation, the initial Monte Carlo "gauge field generation" phase requires capability-level supercomputing, corresponding to O(100) GPUs or more. Strong scaling to this many GPUs has not previously been achieved. In this contribution we demonstrate that combining a multi-dimensional parallelization strategy with a domain-decomposed preconditioner allows us to scale into this regime. We present results for two popular discretizations of the Dirac operator, Wilson-clover and improved staggered, employing up to 256 GPUs on the Edge cluster at Lawrence Livermore National Laboratory.

Chair/Author Details:

Ronald Babich - Boston University

Michael A. Clark - Harvard University

Bálint Joó - Thomas Jefferson National Accelerator Facility

Guochun Shi - National Center for Supercomputing Applications

Richard C. Brower - Boston University

Steven Gottlieb - Indiana University

