Boosting Accuracy of Classical Machine Learning Antispam Classifiers in Real Scenarios by Applying Rough Set Theory

Pérez Díaz, N.; Ruano Ordás, D.; Fdez Riverola, F.; Méndez, J. R.



Journal article published in 2016 by N. Pérez Díaz, D. Ruano Ordás, F. Fdez Riverola, J. R. Méndez

This paper is available in a repository.


Abstract

Spam deliveries are a major obstacle to benefiting from the wide range of Internet-based forms of communication. Although several well-known intelligent techniques for fighting spam exist, only some specific implementations of the Naïve Bayes algorithm are actually used in real environments, for performance reasons. Because some of these algorithms suffer from a large number of false positive errors, in this work we propose a rough set postprocessing approach that can significantly improve their accuracy. To demonstrate the advantages of the proposed method, we carried out a straightforward study on a publicly available standard corpus (SpamAssassin) that compares the performance of well-known, previously successful antispam classifiers (i.e., Support Vector Machines, AdaBoost, Flexible Bayes, and Naïve Bayes) with and without our technique. The results clearly show the suitability of our rough set postprocessing approach for increasing the accuracy of these classifiers when working in real scenarios.
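
The abstract does not describe how the postprocessing step works, so the following is only a minimal sketch of the general rough set idea it builds on, not the authors' method. It assumes each message is reduced to a few discrete attributes (the hypothetical `base_pred` and `has_url`), groups messages with identical attribute values into equivalence classes, and computes the lower approximation (classes that are entirely spam) and upper approximation (classes that contain any spam) of the spam set. The `postprocess` rule, which keeps a "spam" verdict only when the message falls in a class that is certainly spam, is likewise an illustrative assumption about how false positives could be reduced.

```python
from collections import defaultdict


def approximations(objects, condition, decision, target):
    """Return (lower, upper) approximations of {objects whose decision == target}.

    Objects that share the same values on the condition attributes are
    indiscernible and form one equivalence class.
    """
    classes = defaultdict(list)
    for idx, obj in enumerate(objects):
        key = tuple(obj[a] for a in condition)
        classes[key].append(idx)

    lower, upper = set(), set()
    for members in classes.values():
        labels = {objects[i][decision] for i in members}
        if labels == {target}:      # every member has the target label
            lower.update(members)
        if target in labels:        # at least one member has the target label
            upper.update(members)
    return lower, upper


def postprocess(message, training, condition, decision="label"):
    """Hypothetical rule: keep a 'spam' verdict only if it is certain."""
    lower, _ = approximations(training, condition, decision, "spam")
    certain_keys = {tuple(training[i][a] for a in condition) for i in lower}
    key = tuple(message[a] for a in condition)
    if message["base_pred"] == "spam" and key not in certain_keys:
        return "ham"                # boundary region: avoid a false positive
    return message["base_pred"]


# Toy, made-up decision table (not data from the paper).
training = [
    {"base_pred": "spam", "has_url": True,  "label": "spam"},
    {"base_pred": "spam", "has_url": True,  "label": "spam"},
    {"base_pred": "spam", "has_url": False, "label": "spam"},
    {"base_pred": "spam", "has_url": False, "label": "ham"},
    {"base_pred": "ham",  "has_url": False, "label": "ham"},
]
condition = ("base_pred", "has_url")

lower, upper = approximations(training, condition, "label", "spam")
boundary = upper - lower            # messages whose class is ambiguous
```

In this toy table, the two messages with `base_pred == "spam"` and `has_url == False` land in the boundary region because their class mixes spam and ham, so the sketch would downgrade a new message with those attribute values to ham.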