Published in

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DOI: 10.24963/ijcai.2020/778

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Proceedings article published in 2020 by Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren
This paper was not found in any repository; the policy of its publisher is unknown or unclear.

Full text: Unavailable

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

High-end mobile platforms are rapidly becoming the primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources of these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices. This demo shows that these optimizations enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
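
To illustrate the structured pruning idea the abstract refers to, the following Python/PyTorch sketch zeroes out whole convolution filters ranked by L2 magnitude. This is a minimal illustration, not the authors' implementation: the function name, the magnitude-based criterion, the 50% pruning ratio, and the layer shapes are all assumptions made here for the example.

    # Minimal sketch of magnitude-based structured (filter) pruning.
    # Pruning whole filters, rather than scattered individual weights,
    # preserves a regular structure that compilers can exploit on
    # mobile hardware. All names and ratios below are illustrative.
    import torch
    import torch.nn as nn

    def prune_conv_filters(conv: nn.Conv2d, ratio: float = 0.5) -> nn.Conv2d:
        """Zero out the output filters of `conv` with the smallest L2 norms."""
        with torch.no_grad():
            # L2 norm of each output filter: shape (out_channels,)
            norms = conv.weight.flatten(1).norm(p=2, dim=1)
            n_prune = int(ratio * conv.out_channels)
            # Indices of the weakest filters
            _, idx = torch.topk(norms, n_prune, largest=False)
            conv.weight[idx] = 0.0
            if conv.bias is not None:
                conv.bias[idx] = 0.0
        return conv

    # Example: prune half the filters of a single conv layer.
    layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
    prune_conv_filters(layer, ratio=0.5)

In the paper's setting, the regularity left behind by structured pruning is what allows the compiler-level optimizations to generate efficient mobile code; the sketch above only shows the pruning side of that pipeline.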