Information Bottleneck Theory Based Exploration of Cascade Learning

Du, Xin; Farrahi, Katayoun; Niranjan, Mahesan

Published in

MDPI, Entropy, 23(10), p. 1360, 2021

DOI: 10.3390/e23101360

Journal article published in 2021 by Xin Du, Katayoun Farrahi, Mahesan Niranjan

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool for understanding how representations are formed observes the dynamics of learning on an information plane, using mutual information to link the input to the representation (I(X;T)) and the representation to the target (I(T;Y)). In this paper, we use an information-theoretic approach to understand how Cascade Learning (CL), a method that trains deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information compression, which differs from observations of End-to-End (E2E) learning. Additionally, CL can inherit information about targets and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic for setting the depth of a neural network that achieves satisfactory classification accuracy.
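The information transition ratio named in the abstract, I(T;Y)/I(X;T), can be illustrated with a minimal sketch. It assumes that X, T and Y are available as paired sequences of discrete symbols (continuous layer activations would first need to be discretised, for example by binning) and uses a simple plug-in estimate of mutual information. The abstract does not say which estimator the paper uses, so the function names `mutual_information` and `information_transition_ratio` and the estimator itself are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Plug-in estimate of I(A;B) in bits from paired discrete samples.

    Assumption: a and b are equal-length sequences of hashable symbols.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("a and b must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # counts of each (a, b) pair
    count_a = Counter(a)         # marginal counts of a
    count_b = Counter(b)         # marginal counts of b
    # Sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) ).
    return sum(
        (c / n) * log2(c * n / (count_a[u] * count_b[v]))
        for (u, v), c in joint.items()
    )


def information_transition_ratio(x, t, y):
    """Ratio I(T;Y) / I(X;T) for input x, representation t, target y."""
    i_xt = mutual_information(x, t)
    i_ty = mutual_information(t, y)
    if i_xt == 0.0:
        raise ValueError("I(X;T) is zero, so the ratio is undefined")
    return i_ty / i_xt


# Toy data (hypothetical): four distinct inputs, two target classes.
x = [0, 1, 2, 3]
y = [0, 0, 1, 1]
t_identity = [0, 1, 2, 3]   # representation that keeps every input detail
t_compact = [0, 0, 1, 1]    # representation that keeps only the class

ratio_identity = information_transition_ratio(x, t_identity, y)
ratio_compact = information_transition_ratio(x, t_compact, y)
```

In this toy setting the representation that copies the input carries 2 bits about X but only 1 bit about Y, giving a ratio of 0.5, while the representation that keeps only the class carries 1 bit about each, giving a ratio of 1. Under these assumptions, a higher ratio means a larger share of the information retained from the input is relevant to the target.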
