Deconstructing Cross-Entropy for Probabilistic Binary Classifiers

Ramos, Daniel; Franco-Pedroso, Javier; Lozano-Diez, Alicia; Gonzalez-Rodriguez, Joaquin

Published in

MDPI, Entropy, 20(3), p. 208, 2018

DOI: 10.3390/e20030208

Journal article published in 2018 by Daniel Ramos, Javier Franco-Pedroso, Alicia Lozano-Diez, Joaquin Gonzalez-Rodriguez

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

In this work, we analyze the cross-entropy function, which is widely used in classifiers both as a performance measure and as an optimization objective. We place cross-entropy in the context of Bayesian decision theory, the formal probabilistic framework for making decisions, and analyze its motivation, meaning and interpretation from an information-theoretical point of view. The article makes four contributions. First, we explicitly analyze the contribution to cross-entropy of (i) prior knowledge and (ii) the value of the features, expressed as a likelihood ratio. Second, we introduce a decomposition of cross-entropy into two components: discrimination and calibration. This decomposition allows different aspects of classifier performance to be measured more precisely, and it justifies previously reported strategies that obtain reliable probabilities by calibrating the output of a discriminating classifier. Third, we give different information-theoretical interpretations of cross-entropy, which can be useful in different application scenarios and which are related to the concept of reference probabilities. Fourth, we present an analysis tool, the Empirical Cross-Entropy (ECE) plot, a compact representation of cross-entropy and its decomposition. We show the power of ECE plots, compared with other classical performance representations, in two diverse experimental examples: a speaker verification system and a forensic case involving glass findings.
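To make the roles of the prior and the likelihood ratio concrete, the following is a minimal sketch of an empirical cross-entropy computation. It is not taken from the paper: the function name `empirical_cross_entropy`, its arguments, and the use of natural-log likelihood ratios are assumptions made for illustration, and the formula is the commonly stated prior-weighted form in which each class contributes the average of -log2 of the posterior probability assigned to the true class.

```python
import math


def empirical_cross_entropy(llrs_target, llrs_nontarget, prior_target):
    """Empirical cross-entropy in bits (illustrative sketch, hypothetical API).

    llrs_target:    natural-log likelihood ratios for trials where the
                    target hypothesis is true
    llrs_nontarget: natural-log likelihood ratios for trials where it is false
    prior_target:   prior probability of the target hypothesis, in (0, 1)
    """
    if not 0.0 < prior_target < 1.0:
        raise ValueError("prior_target must lie strictly between 0 and 1")
    if not llrs_target or not llrs_nontarget:
        raise ValueError("both score lists must be non-empty")

    prior_log_odds = math.log(prior_target / (1.0 - prior_target))

    def softplus(x):
        # log(1 + exp(x)), written to avoid overflow for large |x|
        return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

    # Posterior log-odds = log-likelihood ratio + prior log-odds.
    # Cost of a target trial:     -log P(target | x)     = softplus(-posterior log-odds)
    # Cost of a non-target trial: -log P(non-target | x) = softplus(+posterior log-odds)
    cost_target = sum(softplus(-(llr + prior_log_odds)) for llr in llrs_target)
    cost_target /= len(llrs_target)
    cost_nontarget = sum(softplus(llr + prior_log_odds) for llr in llrs_nontarget)
    cost_nontarget /= len(llrs_nontarget)

    nats = prior_target * cost_target + (1.0 - prior_target) * cost_nontarget
    return nats / math.log(2.0)  # convert nats to bits


# A system that always outputs LR = 1 adds nothing to the prior, so its
# cross-entropy equals the entropy of the prior (1 bit at prior 0.5).
neutral = empirical_cross_entropy([0.0], [0.0], 0.5)

# Hypothetical scores from a system that mostly supports the correct hypothesis.
informative = empirical_cross_entropy([2.0, 3.0, 1.5], [-2.5, -1.0, -3.0], 0.5)
```

Evaluating this function over a range of prior log-odds, alongside the neutral system as a reference, gives the kind of curve the abstract describes as an ECE plot. The split into discrimination and calibration components is not attempted in this sketch.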
