Published in

Proceedings of the 7th Balkan Conference on Informatics Conference - BCI '15

DOI: 10.1145/2801081.2801116

Dealing with noisy data in the context of k-NN Classification

Proceedings article published in 2015 by Stefanos Ougiaroglou and Georgios Evangelidis
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Like many other classifiers, the k-NN classifier is noise-sensitive: its accuracy depends heavily on the quality of the training data. Noise and mislabeled data, as well as outliers and overlaps between data regions of different classes, lead to less accurate classification. This problem can be dealt with either by adopting a large k value or by pre-processing the training set with an editing algorithm. The first strategy involves trial-and-error attempts to tune the value of k, while the second constitutes a time-consuming pre-processing step. This paper discusses and compares these two strategies and reveals their advantages and drawbacks.
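
As a rough illustration of the two strategies described in the abstract (not code from the paper itself), the sketch below contrasts (a) tuning k by trial and error and (b) editing the training set with a Wilson-style ENN pass before 1-NN classification. The dataset, noise level, and candidate k values are illustrative assumptions; scikit-learn is used for the k-NN machinery.

```python
# Minimal sketch, assuming a synthetic dataset with injected label noise.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# flip_y=0.2 mislabels ~20% of the instances to simulate a noisy training set.
X, y = make_classification(n_samples=2000, n_features=10, flip_y=0.2,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Strategy 1: absorb noise by enlarging k (trial-and-error tuning).
for k in (1, 5, 15, 31):
    acc = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"k={k:>2}: accuracy={acc:.3f}")

# Strategy 2: edit the training set first (Wilson-style ENN): discard every
# training instance whose class disagrees with the majority of its k nearest
# neighbours, then classify with 1-NN on the edited set.
def enn_edit(X, y, k=3):
    knn = KNeighborsClassifier(n_neighbors=k + 1).fit(X, y)
    neigh = knn.kneighbors(X, return_distance=False)[:, 1:]  # drop the point itself
    majority = np.array([np.bincount(y[idx]).argmax() for idx in neigh])
    keep = majority == y
    return X[keep], y[keep]

X_ed, y_ed = enn_edit(X_tr, y_tr)
acc = KNeighborsClassifier(n_neighbors=1).fit(X_ed, y_ed).score(X_te, y_te)
print(f"ENN-edited training set, 1-NN: accuracy={acc:.3f}")
```

The sketch mirrors the trade-off the paper examines: the first loop spends its effort searching for a good k, whereas the editing function pays a one-off pre-processing cost to clean the training set and then uses a small k.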