Dissemin is shutting down on January 1st, 2025

Published in

Association for Computing Machinery (ACM), ACM Transactions on Architecture and Code Optimization, 1(18), p. 1-26, 2021

DOI: 10.1145/3427093

Links

Tools

Export citation

Search in Google Scholar

Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)

Journal article published in 2021 by Ari Rasch ORCID, Richard Schulze, Michel Steuwer ORCID, Sergei Gorlatch
This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Red circle
Preprint: archiving forbidden
Red circle
Postprint: archiving forbidden
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Auto-tuning is a popular approach to program optimization: it automatically finds good configurations of a program’s so-called tuning parameters whose values are crucial for achieving high performance for a particular parallel architecture and characteristics of input/output data. We present three new contributions of the Auto-Tuning Framework (ATF), which enable a key advantage in general-purpose auto-tuning : efficiently optimizing programs whose tuning parameters have interdependencies among them. We make the following contributions to the three main phases of general-purpose auto-tuning: (1) ATF generates the search space of interdependent tuning parameters with high performance by efficiently exploiting parameter constraints; (2) ATF stores such search spaces efficiently in memory, based on a novel chain-of-trees search space structure; (3) ATF explores these search spaces faster, by employing a multi-dimensional search strategy on its chain-of-trees search space representation. Our experiments demonstrate that, compared to the state-of-the-art, general-purpose auto-tuning frameworks, ATF substantially improves generating, storing, and exploring the search space of interdependent tuning parameters, thereby enabling an efficient overall auto-tuning process for important applications from popular domains, including stencil computations, linear algebra routines, quantum chemistry computations, and data mining algorithms.