Comparing and Improving Active Learning Uncertainty Measures for Transformer Models by Discarding Outliers

Gonsior, Julius; Falkenberg, Christian; Magino, Silvio; Reusch, Anja; Hartmann, Claudio; Thiele, Maik; Lehner, Wolfgang

Published in

Springer, Information Systems Frontiers, 2024

DOI: 10.1007/s10796-024-10503-z


Journal article published in 2024 by Julius Gonsior, Christian Falkenberg, Silvio Magino, Anja Reusch, Claudio Hartmann, Maik Thiele, Wolfgang Lehner


Abstract

Despite achieving state-of-the-art results in nearly all Natural Language Processing applications, fine-tuning Transformer-encoder-based language models still requires a significant amount of labeled data to achieve satisfactory results. A well-known technique for reducing the human effort needed to acquire a labeled dataset is Active Learning (AL): an iterative process in which only a minimal number of samples is labeled. AL strategies require access to a quantified confidence measure of the model's predictions. A common choice is the softmax activation function in the final neural network layer. In this paper, we compare eight alternatives on seven datasets and show that the softmax function provides misleading probabilities. Our finding is that most of the methods primarily identify hard-to-learn-from samples (commonly called outliers) rather than samples that actually reduce the uncertainty of the learned language model, resulting in worse-than-random performance. As a solution, this paper proposes Uncertainty-Clipping, a heuristic that systematically excludes such samples and yields improvements for most methods compared to the softmax function.
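The selection step the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes least-confidence scoring on softmax outputs as the uncertainty measure, and it assumes that Uncertainty-Clipping amounts to skipping a fixed fraction of the most uncertain pool samples before choosing the batch to label. The names `select_batch` and `clip_fraction`, the default value of 0.05, and the rounding rule are hypothetical choices made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw model outputs."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the predicted class."""
    return 1.0 - max(probs)


def select_batch(pool_logits, batch_size, clip_fraction=0.05):
    """Return indices of unlabeled samples to send to the annotator.

    Assumption: the most uncertain `clip_fraction` of the pool is treated
    as likely outliers and skipped; the next `batch_size` most uncertain
    samples are selected. The exact rule in the paper may differ.
    """
    scores = [least_confidence(softmax(logits)) for logits in pool_logits]
    # Rank pool indices from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Rounding up is an illustrative choice, not taken from the paper.
    n_clipped = math.ceil(clip_fraction * len(ranked))
    return ranked[n_clipped:n_clipped + batch_size]


if __name__ == "__main__":
    # Four unlabeled samples with two-class logits (made-up values).
    pool = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [0.2, 0.0]]
    print(select_batch(pool, batch_size=2, clip_fraction=0.0))
    print(select_batch(pool, batch_size=2, clip_fraction=0.25))
```

With `clip_fraction=0.0` the function reduces to plain uncertainty sampling; a positive value shifts the selection window past the top-ranked samples, which is the effect the abstract attributes to discarding outliers.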
