Published in

IFIP International Federation for Information Processing, pp. 199-208

DOI: 10.1007/978-0-387-34747-9_21

On the Class Distribution Labelling Step Sensitivity of CO-TRAINING

Conference paper published in 2006 by Edson Takashi Matsubara, Maria Carolina Monard, Ronaldo C. Prati
This paper is made freely available by the publisher.


Abstract

Co-training can learn from datasets having a small number of labelled examples and a large number of unlabelled ones. It is an iterative algorithm in which examples labelled in previous iterations are used to improve the classification of examples from the unlabelled set. However, as the number of initial labelled examples is often small, we do not have reliable estimates regarding the underlying population which generated the data. In this work we claim that the proportion in which examples are labelled is a key parameter of co-training. Furthermore, we carried out a series of experiments to investigate how the proportion in which examples are labelled in each step influences co-training performance. Results show that co-training should be used with care in challenging domains.

Presented at: IFIP International Conference on Artificial Intelligence in Theory and Practice - Knowledge Acquisition and Data Mining
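The iterative labelling step the abstract discusses can be sketched as follows. This is a minimal illustration only: the nearest-centroid base learner, the two-view tuples, and the parameters n_pos/n_neg (the class proportion in which examples are labelled in each iteration, the sensitivity of which the paper studies) are assumptions made for the sketch, not the paper's exact experimental setup; classic co-training (Blum and Mitchell) typically uses a stronger base learner such as Naive Bayes.

```python
def train_centroid(examples):
    # Fit a per-class centroid from a list of (feature_vector, label) pairs.
    sums, counts = {}, {}
    for x, y in examples:
        c = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            c[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in c] for y, c in sums.items()}

def predict_with_confidence(centroids, x):
    # Predict the nearest class; confidence is the negative squared
    # distance to that class centroid (higher = more confident).
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    label = min(centroids, key=lambda y: dist(centroids[y]))
    return label, -dist(centroids[label])

def co_training(labelled, unlabelled, n_pos, n_neg, iterations=5):
    # labelled: list of ((view1, view2), label); unlabelled: list of (view1, view2).
    # Each iteration, each view's classifier moves its n_pos most confident
    # positives and n_neg most confident negatives into the labelled set,
    # so the class proportion of newly labelled examples is fixed by n_pos:n_neg.
    unlabelled = list(unlabelled)
    for _ in range(iterations):
        if not unlabelled:
            break
        for view in (0, 1):
            model = train_centroid([(x[view], y) for x, y in labelled])
            scored = [(predict_with_confidence(model, x[view]), x)
                      for x in unlabelled]
            for target_label, k in ((1, n_pos), (0, n_neg)):
                best = sorted((s for s in scored if s[0][0] == target_label),
                              key=lambda s: -s[0][1])[:k]
                for (lab, _), x in best:
                    labelled.append((x, lab))
                    unlabelled.remove(x)
    return labelled
```

On a toy two-view dataset where both views separate the classes, the sketch labels every unlabelled example; the paper's point is that when n_pos:n_neg misestimates the true class distribution, the labelled set drifts away from the underlying population.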