Published in

Springer, Lecture Notes in Computer Science, p. 24-35, 2005

DOI: 10.1007/11552253_3

Balancing Strategies and Class Overlapping

Proceedings article published in 2005 by Gustavo E. A. P. A. Batista, Ronaldo C. Prati, Maria Carolina Monard
This paper is available in a repository.

Preprint: archiving forbidden
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Several studies have pointed out that class imbalance is a bottleneck in the performance achieved by standard supervised learning systems. However, a complete understanding of how this problem affects the performance of learning is still lacking. In previous work we identified that performance degradation is not solely caused by class imbalances, but is also related to the degree of class overlapping. In this work, we conduct our research a step further by investigating sampling strategies which aim to balance the training set. Our results show that these sampling strategies usually lead to a performance improvement for highly imbalanced data sets having highly overlapped classes. In addition, over-sampling methods seem to outperform under-sampling methods.
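
To make the idea of balancing a training set concrete, the following is a minimal sketch of plain random over-sampling and random under-sampling. It is an illustration only: the abstract does not name the specific sampling methods evaluated, so the function names `random_oversample` and `random_undersample`, the `seed` parameter, and the choice of purely random selection are all assumptions rather than the authors' procedure.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples until every class is as large
    as the largest class (hypothetical helper, not the paper's method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original examples belonging to this class.
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class so every class is as small
    as the smallest class (hypothetical helper, not the paper's method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_samples, out_labels = [], []
    for cls in counts:
        pool = [s for s, l in zip(samples, labels) if l == cls]
        # Sample without replacement so no example is repeated.
        for s in rng.sample(pool, target):
            out_samples.append(s)
            out_labels.append(cls)
    return out_samples, out_labels


# Toy imbalanced set: 2 positive examples, 8 negative examples.
samples = list(range(10))
labels = [1] * 2 + [0] * 8
over_x, over_y = random_oversample(samples, labels)
under_x, under_y = random_undersample(samples, labels)
```

Over-sampling keeps every original example and adds duplicates of the minority class, whereas under-sampling discards majority-class examples; this difference in what information is retained is one reason the two families can behave differently on overlapped classes.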