An Optimized Reduction Design to Minimize Atomic Operations in Shared Memory Multiprocessors

Speziale, Ettore; di Biagio, Andrea; Agosta, Giovanni

Published in

2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum

DOI: 10.1109/ipdps.2011.271

Proceedings article published in 2011 by Ettore Speziale, Andrea di Biagio, Giovanni Agosta

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving forbidden

Abstract

Reduction operations play a key role in modern massively data-parallel computation. However, current implementations in shared-memory programming APIs such as OpenMP are often computational bottlenecks because of the high number of atomic operations involved. We propose a reduction design that exploits the coupling with a barrier synchronization to optimize the execution of the reduction. Experimental results show that the number of atomic operations involved is dramatically reduced, which can lead to significant improvement in scaling properties on large numbers of processing elements. We report a speedup of 1.53x on the 312.swim_m SPEC OMP2001 benchmark and a speedup of 4.02x on the streamcluster benchmark from the PARSEC suite over the baseline.
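The abstract does not spell out the design itself, so the following is only a minimal sketch of the general idea of coupling a reduction with a barrier, not the authors' implementation. It assumes each worker folds its data into a private slot and that a single worker combines the slots once every worker has passed the barrier, so no atomic update of a shared accumulator is needed. The names `barrier_reduce`, `partial`, and `worker` are hypothetical, and Python threads are used only to show the structure, not to measure performance.

```python
import threading


def barrier_reduce(chunks, combine, identity):
    """Reduce per-worker chunks using private slots plus one barrier.

    Illustrative assumption: one thread per chunk, no shared accumulator.
    """
    n = len(chunks)
    if n == 0:
        return identity

    partial = [identity] * n   # one private slot per worker
    result = [identity]        # written by exactly one worker
    barrier = threading.Barrier(n)

    def worker(tid):
        # Local fold: touches only this worker's data and slot.
        acc = identity
        for x in chunks[tid]:
            acc = combine(acc, x)
        partial[tid] = acc

        # Barrier.wait() hands each thread a distinct index in range(n).
        # Once it returns, every slot has been written, so the thread
        # that receives index 0 can combine them without atomics.
        if barrier.wait() == 0:
            total = identity
            for p in partial:
                total = combine(total, p)
            result[0] = total

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


if __name__ == "__main__":
    data = [list(range(i * 250, (i + 1) * 250)) for i in range(4)]
    print(barrier_reduce(data, lambda a, b: a + b, 0))
```

In this sketch the barrier is the only synchronization point, which is the property the abstract attributes to the proposed design; how the paper assigns the final combine step and how it integrates with the OpenMP runtime are not stated here and are left open.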
