An Efficient and Flexible Stochastic CGRA Mapping Approach

Das, Satyajit; Martin, Kevin; Peyret, Thomas; Coussy, Philippe

Published in

Association for Computing Machinery (ACM), ACM Transactions on Embedded Computing Systems, 1(22), p. 1-24, 2022

DOI: 10.1145/3550071

Journal article published in 2022 by Satyajit Das, Kevin Martin, Thomas Peyret, Philippe Coussy

Abstract

Coarse-Grained Reconfigurable Array (CGRA) architectures are promising high-performance and power-efficient platforms. However, mapping applications efficiently onto a CGRA is a challenging task: the problem is known to be NP-complete, so finding good mapping solutions for a given CGRA architecture within a reasonable time is difficult. Additionally, achieving scalability in compilation time and memory footprint for large heterogeneous CGRAs is a well-known problem. In this article, we present a stochastic mapping approach that efficiently explores the architecture space and allows the best solutions to be found while keeping memory usage limited and steady. Experimental results show that, thanks to better exploration of the mapping solution space, our compilation flow achieves performance on low-complexity CGRA architectures that is as good as that obtained with more complex ones. The parameters considered in our experiments include the number of tiles, Register File (RF) size, number of load/store (LS) units, and network topology. Our results demonstrate that high-quality compilation for a wide range of applications is possible within reasonable run-times. Experiments with several DSP benchmarks show that the best CGRA configuration from the architectural exploration surpasses an ultra-low-power, DSP-optimized RISC-V CPU, achieving up to 15.28× performance gain (average 6×, minimum 3.4×) and up to 29.7× energy gain (average 13.5×, minimum 6.3×) with an area overhead of only 1.5×.
