Proceedings of the 7th Balkan Conference on Informatics Conference - BCI '15
Like many other classifiers, the k-NN classifier is sensitive to noise: its accuracy depends heavily on the quality of the training data. Noise and mislabeled instances, as well as outliers and overlaps between data regions of different classes, lead to less accurate classification. This problem can be addressed either by adopting a large k value or by pre-processing the training set with an editing algorithm. The first strategy involves trial-and-error tuning of k, while the second constitutes a time-consuming pre-processing step. This paper discusses and compares these two strategies and reveals their advantages and drawbacks.
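To illustrate the second strategy, the sketch below implements one classic editing algorithm, Wilson's Edited Nearest Neighbor (ENN), which removes every training point whose label disagrees with the majority label of its k nearest neighbors. This is only an example of the editing family the abstract refers to, not necessarily the specific algorithm evaluated in the paper; the function names and the toy k are my own.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k):
    # Classify x by majority vote among its k nearest training points
    # (plain Euclidean distance).
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

def wilson_enn(X, y, k=3):
    # Wilson's ENN editing: drop every training point that disagrees with
    # the majority label of its k nearest neighbors (excluding itself).
    keep = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        if knn_predict(X[mask], y[mask], X[i], k) == y[i]:
            keep.append(i)
    return X[keep], y[keep]
```

On a toy set of two well-separated clusters plus one mislabeled point inside the wrong cluster, `wilson_enn` discards the mislabeled point, after which even a small k (e.g. k=1) classifies cleanly; without editing, the same effect would require a larger k to outvote the noisy neighbor, which is exactly the trade-off the paper examines.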