DTO-SMOTE: Delaunay Tessellation Oversampling for Imbalanced Data Sets

de Carvalho, Alexandre M.; Prati, Ronaldo C.

Published in

MDPI, Information, 11(12), p. 557, 2020

DOI: 10.3390/info11120557

Journal article published in 2020 by Alexandre M. de Carvalho, Ronaldo C. Prati

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

One of the significant challenges in machine learning is the classification of imbalanced data. In many situations, standard classifiers cannot learn to distinguish minority class examples from the others. Since many real-world problems are imbalanced, this issue has become highly relevant and is now widely studied. This paper presents a new preprocessing method based on Delaunay tessellation and the preprocessing algorithm SMOTE (Synthetic Minority Over-sampling Technique), which we call DTO-SMOTE (Delaunay Tessellation Oversampling SMOTE). DTO-SMOTE constructs a mesh of simplices (in this paper, we use tetrahedrons) for creating synthetic examples. We compare DTO-SMOTE with five preprocessing algorithms (GEOMETRIC-SMOTE, SVM-SMOTE, SMOTE-BORDERLINE-1, SMOTE-BORDERLINE-2, and SMOTE) using eight classification algorithms on 61 binary-class data sets. For some classifiers, DTO-SMOTE outperforms the other methods in terms of Area Under the ROC Curve (AUC), Geometric Mean (GEO), and Generalized Index of Balanced Accuracy (IBA).
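The idea of creating a synthetic example inside a simplex can be illustrated with a minimal sketch. It is not the authors' implementation: it skips the Delaunay tessellation step and the choice of which simplex to use, and it assumes the new point is a random convex combination of the simplex vertices. The function name `sample_in_simplex`, the weighting scheme, and the example tetrahedron are all hypothetical.

```python
import random


def sample_in_simplex(vertices, rng=random):
    """Return a random convex combination of the given simplex vertices.

    Assumption for illustration: weights are drawn uniformly over the
    simplex; the paper's actual weighting scheme may differ.
    """
    # Normalized exponential draws give non-negative weights that sum to 1.
    raw = [rng.expovariate(1.0) for _ in vertices]
    total = sum(raw)
    weights = [r / total for r in raw]

    # The weighted sum of the vertices lies inside the simplex.
    dim = len(vertices[0])
    return [sum(w * v[d] for w, v in zip(weights, vertices)) for d in range(dim)]


# Hypothetical tetrahedron formed by four minority-class examples in 3-D.
tetrahedron = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]

synthetic = sample_in_simplex(tetrahedron, random.Random(0))
print(synthetic)
```

Because the weights are non-negative and sum to one, the generated point always falls within the tetrahedron spanned by the four examples. SMOTE, by contrast, interpolates along the line segment between a minority example and one of its neighbors.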
