Institute of Electrical and Electronics Engineers, IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control, 1(61), p. 207-213, 2014
DOI: 10.1109/tuffc.2014.6689790
Deformation of tissue can be accurately estimated from radio-frequency ultrasound data using a 2-dimensional normalized cross-correlation (NCC)-based algorithm. This procedure, however, is computationally expensive. A major reduction in computation time can be achieved by parallelizing the numerous NCC computations. In this paper, two parallelization approaches were investigated: the OpenMP interface on a multi-CPU system and Compute Unified Device Architecture (CUDA) on a graphics processing unit (GPU). The performance of the OpenMP and GPU approaches was compared with that of a conventional Matlab implementation of NCC. The OpenMP approach with 8 threads achieved a maximum speed-up factor of 132 on the computation of NCC, whereas the GPU approach on an Nvidia Tesla K20 achieved a maximum speed-up factor of 376. Neither parallelization approach resulted in a significant loss in image quality of the elastograms. Parallelization of the NCC computations using the GPU, therefore, significantly reduces the computation time and increases the frame rate for motion estimation.
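To make the computation concrete, the following is a minimal serial sketch of NCC-based block matching in plain Python. It is not the implementation from the paper: the zero-mean form of NCC, the exhaustive integer-sample search, and the names `ncc`, `block`, and `best_displacement` are all assumptions chosen for illustration.

```python
import math


def ncc(kernel, window):
    """Zero-mean normalized cross-correlation of two equally sized 2-D blocks.

    Assumption: the zero-mean variant is used here; the abstract does not
    state which NCC variant the authors used.
    """
    a = [v for row in kernel for v in row]
    b = [v for row in window for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A constant block has zero variance; report "no correlation" in that case.
    return num / den if den > 0 else 0.0


def block(frame, top, left, height, width):
    """Extract a height-by-width sub-block from a 2-D list of samples."""
    return [row[left:left + width] for row in frame[top:top + height]]


def best_displacement(pre, post, top, left, height, width, search):
    """Return (score, dy, dx) for the shift within +/- `search` samples
    that maximizes NCC between a pre-deformation kernel and the
    post-deformation frame (hypothetical helper, exhaustive search)."""
    kernel = block(pre, top, left, height, width)
    best = (-2.0, 0, 0)  # NCC lies in [-1, 1], so -2 is a safe initial value
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidate windows that fall outside the frame.
            if r < 0 or c < 0 or r + height > len(post) or c + width > len(post[0]):
                continue
            score = ncc(kernel, block(post, r, c, height, width))
            if score > best[0]:
                best = (score, dy, dx)
    return best
```

Each call to `best_displacement` reads only its own kernel and search region and writes only its own result, so the calls for different kernel positions are independent of one another. That independence is what allows the work to be distributed over CPU threads or GPU threads, as in the two approaches compared above.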