2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2017.7952340
The increasing pervasiveness and scale of machine learning technologies pose fundamental challenges for their realisation. Current algorithms are, in the main, centralised: a large number of processing agents, distributed across parallel processing resources, access a single, very large data object. This creates bottlenecks as a result of limited memory access rates. Distributed learning has the potential to resolve this problem by employing networks of co-operating agents, each operating on a subset of the data, but their suitability for realisation on parallel architectures such as multicore is as yet unknown. This paper presents the results of a case study deploying distributed dictionary learning for microarray gene expression bi-clustering on a 16-core Epiphany multicore. It shows that distributed learning approaches can enable near-linear speed-up with the number of processing resources and that the use of DMA-based communication can increase throughput by 50%.