Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data

Li, Fengqi; Yu, Chuang; Yang, Nanhai; Xia, Feng; Li, Guangming; Kaveh-Yazdy, Fatemeh

Published in

Hindawi, Scientific World Journal, (2013), p. 1-9, 2013

DOI: 10.1155/2013/875450

Journal article published in 2013 by Fengqi Li, Chuang Yu, Nanhai Yang, Feng Xia, Guangming Li, Fatemeh Kaveh-Yazdy

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Transductive graph-based semisupervised learning methods usually build an undirected graph that uses both labeled and unlabeled samples as vertices. These methods propagate the label information of labeled samples to their neighbors along the edges in order to predict the labels of unlabeled samples. Most popular semisupervised learning approaches are sensitive to the initial label distribution, which becomes a problem when the labeled dataset is imbalanced: the class boundary is severely skewed by the majority classes. In this paper, we propose a simple and effective approach that alleviates the unfavorable influence of imbalance by iteratively selecting a few unlabeled samples and adding them to the minority classes, forming a balanced labeled dataset for the subsequent learning methods. Experiments on UCI datasets and the MNIST handwritten digits dataset show that the proposed approach outperforms other existing state-of-the-art methods.
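
The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the selection rule, so the code assumes that "nearest" means the smallest Euclidean distance from an unlabeled sample to any labeled sample of the minority class. The function name `balance_by_nearest_unlabeled` and the parameter `per_iter` are hypothetical names introduced only for this example.

```python
import numpy as np


def balance_by_nearest_unlabeled(X_lab, y_lab, X_unl, per_iter=1):
    """Grow minority classes with nearby unlabeled points until balanced.

    Assumption (not from the paper): a candidate's distance to a class is
    its Euclidean distance to the closest labeled member of that class.
    Returns the enlarged labeled set and a mask of still-unused points.
    """
    X_lab = np.asarray(X_lab, dtype=float)
    y_lab = np.asarray(y_lab)
    X_unl = np.asarray(X_unl, dtype=float)
    available = np.ones(len(X_unl), dtype=bool)  # unlabeled points not yet used

    while True:
        classes, counts = np.unique(y_lab, return_counts=True)
        target = counts.max()                # size of the largest class
        minority = classes[counts < target]  # classes that still need samples
        if minority.size == 0 or not available.any():
            break
        for c in minority:
            if not available.any():
                break
            need = target - np.sum(y_lab == c)
            k = min(per_iter, need, available.sum())
            members = X_lab[y_lab == c]
            cand_idx = np.flatnonzero(available)
            # Distance from every candidate to its closest member of class c.
            diffs = X_unl[cand_idx][:, None, :] - members[None, :, :]
            dist = np.linalg.norm(diffs, axis=2).min(axis=1)
            chosen = cand_idx[np.argsort(dist)[:k]]
            # Move the chosen points into the labeled set with label c.
            X_lab = np.vstack([X_lab, X_unl[chosen]])
            y_lab = np.concatenate([y_lab, np.full(k, c, dtype=y_lab.dtype)])
            available[chosen] = False
    return X_lab, y_lab, available


# Toy data: three labeled samples of class 0, one of class 1.
X_lab = [[0, 0], [0, 1], [1, 0], [10, 10]]
y_lab = [0, 0, 0, 1]
X_unl = [[9, 9], [10, 11], [0.5, 0.5], [5, 5]]
X_bal, y_bal, unused = balance_by_nearest_unlabeled(X_lab, y_lab, X_unl)
print(np.unique(y_bal, return_counts=True))
```

In this sketch the balanced set `(X_bal, y_bal)` would then be handed, together with the remaining unlabeled samples, to whichever graph-based label propagation method is in use. Adding only a few samples per pass (`per_iter`) lets each newly added point influence which unlabeled neighbors are selected next.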
