Institute of Electrical and Electronics Engineers, IEEE Transactions on Antennas and Propagation, 5(62), p. 2679-2687, 2014
This paper investigates the parallel, distributed-memory computation of the translation operator with $L+1$ multipoles in the three-dimensional Multilevel Fast Multipole Algorithm (MLFMA). A baseline, communication-free parallel algorithm can compute such a translation operator in $\mathcal{O}(L)$ time, using $\mathcal{O}(L^{2})$ processes. We propose a parallel algorithm that reduces this complexity to $\mathcal{O}(\log L)$ time. This complexity is theoretically supported and experimentally validated up to 16 384 parallel processes. For realistic cases, the implementation of the proposed algorithm proves to be up to ten times faster than the baseline algorithm. For a large-scale parallel MLFMA simulation with 4096 parallel processes, the runtime for the computation of all translation operators during the setup stage is reduced from roughly one hour to only a few minutes.
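To make the $\mathcal{O}(L)$ baseline cost concrete, the following minimal Python sketch evaluates one sample of a translation operator by directly summing its $L+1$ multipole terms. It assumes a commonly used textbook form, $T_L(kd,\cos\theta)=\sum_{l=0}^{L}(-j)^{l}(2l+1)\,h_l^{(2)}(kd)\,P_l(\cos\theta)$, with any constant prefactor omitted; the paper's exact normalization and sign convention may differ. The function name `translation_operator` and its parameters are hypothetical, and the upward recurrence for $h_l^{(2)}$ is assumed adequate only while $L$ does not greatly exceed $kd$. If each of the $\mathcal{O}(L^{2})$ angular samples were assigned to its own process (an assumption here, not a detail stated in the abstract), each process would perform one such $\mathcal{O}(L)$ sum without communication. The sketch does not implement the proposed $\mathcal{O}(\log L)$ algorithm.

```python
import cmath


def translation_operator(L, kd, cos_theta):
    """Directly sum an (L+1)-term translation operator for one sample.

    Assumed form (prefactor omitted, convention may differ from the paper):
        T = sum_{l=0}^{L} (-j)^l (2l+1) h_l^(2)(kd) P_l(cos_theta)
    Cost: O(L) operations per sample direction.
    """
    # Closed forms of the spherical Hankel functions of the second kind
    # for l = 0 and l = 1.
    h_prev = 1j * cmath.exp(-1j * kd) / kd
    h_curr = -(kd - 1j) * cmath.exp(-1j * kd) / kd**2
    # Legendre polynomials P_0 and P_1.
    p_prev, p_curr = 1.0, cos_theta

    total = h_prev * p_prev  # l = 0 term
    if L == 0:
        return total
    total += (-1j) * 3 * h_curr * p_curr  # l = 1 term

    for l in range(1, L):
        # Three-term upward recurrences from degree l to degree l + 1.
        h_prev, h_curr = h_curr, (2 * l + 1) / kd * h_curr - h_prev
        p_prev, p_curr = p_curr, (
            (2 * l + 1) * cos_theta * p_curr - l * p_prev
        ) / (l + 1)
        total += (-1j) ** (l + 1) * (2 * l + 3) * h_curr * p_curr
    return total
```

A call such as `translation_operator(20, 25.0, 0.3)` returns one complex sample; a full operator would repeat this for every sampling direction on the sphere.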