Published in

Springer, Lecture Notes in Computer Science, pp. 296-306, 2004

DOI: 10.1007/978-3-540-28645-5_30


Learning with Class Skews and Small Disjuncts

This paper is available in a repository.


Preprint: archiving forbidden
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

One of the main objectives of a Machine Learning (ML) system is to induce a classifier that minimizes classification errors. Two relevant topics in ML are the understanding of which domain characteristics and inducer limitations might cause an increase in misclassification. In this sense, this work analyzes two important issues that might influence the performance of ML systems: class imbalance and error-prone small disjuncts. Our main objective is to investigate how these two important aspects are related to each other. Aiming at overcoming both problems, we analyzed the behavior of two over-sampling methods we have proposed, namely Smote + Tomek links and Smote + ENN. Our results suggest that these methods are effective for dealing with class imbalance and, in some cases, might help in ruling out some undesirable disjuncts. However, in some cases a simpler method, Random over-sampling, provides comparable results while requiring fewer computational resources.
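
The abstract names three over-sampling strategies: Random over-sampling, Smote + Tomek links, and Smote + ENN. The sketch below is not the authors' original implementation; it assumes the third-party imbalanced-learn library, which ships resamplers named RandomOverSampler, SMOTETomek, and SMOTEENN, and a synthetic 9:1 class skew built with scikit-learn, to illustrate how the three methods rebalance a skewed training set.

```python
# Hedged sketch: uses imbalanced-learn's resamplers, not the code from the paper.
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import RandomOverSampler
from imblearn.combine import SMOTETomek, SMOTEENN

# Synthetic two-class problem with roughly a 9:1 class skew.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)
print("original distribution:", Counter(y))

# Compare the three over-sampling strategies discussed in the abstract.
samplers = [
    ("Random over-sampling", RandomOverSampler(random_state=42)),
    ("Smote + Tomek links", SMOTETomek(random_state=42)),
    ("Smote + ENN", SMOTEENN(random_state=42)),
]
for name, sampler in samplers:
    X_res, y_res = sampler.fit_resample(X, y)
    # Train any inducer on the rebalanced data; a decision tree is used here
    # only because symbolic learners make the resulting disjuncts inspectable.
    clf = DecisionTreeClassifier(random_state=42).fit(X_res, y_res)
    print(f"{name}: resampled distribution = {Counter(y_res)}")
```

The combined methods (SMOTETomek, SMOTEENN) first generate synthetic minority examples and then apply a data-cleaning step (Tomek links or Edited Nearest Neighbours), which is what the abstract refers to as potentially ruling out undesirable disjuncts; Random over-sampling simply duplicates minority examples and is therefore cheaper.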