What makes categories difficult to classify?

Sun, Aixin; Lim, Ee-Peng; Liu, Ying

Published in

Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09

DOI: 10.1145/1645953.1646258

Proceedings article published in 2009 by Aixin Sun, Ee-Peng Lim, Ying Liu

Abstract

In this paper, we try to predict which category will be less accurately classified compared with other categories in a classification task that involves multiple categories. The categories with poor predicted performance will be identified before any classifiers are trained, and additional steps can be taken to address the predicted poor accuracies of these categories. Inspired by the work on query performance prediction in ad-hoc retrieval, we propose to predict classification performance using two measures, namely, category size and category coherence. Our experiments on 20-Newsgroup and Reuters-21578 datasets show that the Spearman rank correlation coefficient between the predicted rank of classification performance and the expected classification accuracy is as high as 0.9.
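
The evaluation described in the abstract compares a predicted ranking of categories with their observed classification accuracy using the Spearman rank correlation coefficient. The following is a minimal sketch of that comparison, not the authors' code. The per-category numbers are made-up toy values, and the names `predicted_score` and `observed_accuracy` are hypothetical; the abstract does not say how category size and category coherence are combined into a prediction, so the sketch takes the predicted scores as given.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy values for five hypothetical categories (not taken from the paper).
predicted_score = [0.2, 0.5, 0.9, 0.7, 0.4]
observed_accuracy = [0.60, 0.55, 0.95, 0.80, 0.70]

rho = spearman(predicted_score, observed_accuracy)
print(f"Spearman rank correlation: {rho:.2f}")
```

A coefficient close to 1 means the predicted ordering of categories closely matches the ordering by observed accuracy; only the ranks matter, not the absolute values of the scores.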