Published in

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DOI: 10.24963/ijcai.2020/778

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Proceedings article published in 2020 by Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren
This paper was not found in any repository; the policy of its publisher is unknown or unclear.

Full text: Unavailable

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

High-end mobile platforms are rapidly becoming the primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources of these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices. This demo shows that these optimizations enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
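
To illustrate the structured pruning idea the abstract refers to, the following Python/PyTorch sketch zeroes out whole convolution filters ranked by L2 magnitude. This is a minimal illustration, not the authors' implementation: the function name, the magnitude-based criterion, the 50% pruning ratio, and the layer shapes are all assumptions made here for the example.

    # Minimal sketch of magnitude-based structured (filter) pruning.
    # Pruning whole filters, rather than scattered individual weights,
    # preserves a regular structure that compilers can exploit on
    # mobile hardware. All names and ratios below are illustrative.
    import torch
    import torch.nn as nn

    def prune_conv_filters(conv: nn.Conv2d, ratio: float = 0.5) -> nn.Conv2d:
        """Zero out the output filters of `conv` with the smallest L2 norms."""
        with torch.no_grad():
            # L2 norm of each output filter: shape (out_channels,)
            norms = conv.weight.flatten(1).norm(p=2, dim=1)
            n_prune = int(ratio * conv.out_channels)
            # Indices of the weakest filters
            _, idx = torch.topk(norms, n_prune, largest=False)
            conv.weight[idx] = 0.0
            if conv.bias is not None:
                conv.bias[idx] = 0.0
        return conv

    # Example: prune half the filters of a single conv layer.
    layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
    prune_conv_filters(layer, ratio=0.5)

In the paper's setting, the regularity left behind by structured pruning is what allows the compiler-level optimizations to generate efficient mobile code; the sketch above only shows the pruning side of that pipeline.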