Published in

2008 Eighth International Conference on Intelligent Systems Design and Applications

DOI: 10.1109/isda.2008.299

The Influence of Order on a Large Bag of Words

This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Text classification has mostly been performed through implicit semantic correlation techniques, such as latent semantic analysis. This approach, however, has proved insufficient when short texts must be classified into one or more of many classes. That is the case for the classification of statements of purpose of Brazilian companies according to the roughly 1,800 categories of CNAE-Subclasses, the government administration's detailed breakdown of the National Classification of Economical Activities (CNAE). The impact of word order in a text is evaluated by comparing the performance of three classifiers based on the weightless artificial neural model WISARD. The results point to the need to combine semantic with syntactic information in order to improve the classifiers' performance.
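
The abstract does not say how the three classifiers encode their input, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a textbook WiSARD-style discriminator (RAM nodes addressed by fixed random tuples of input bits) and two hypothetical text encodings: an order-free bag of words, and the same bag extended with adjacent word pairs. The names `encode`, `bag_of_words`, `words_and_bigrams`, `Discriminator` and `WisardSketch`, the parameter values, the example sentence and the class label are all illustrative assumptions.

```python
import random
import zlib


def encode(features, n_bits=256):
    """Hash each feature string to one position of a binary input vector."""
    bits = [0] * n_bits
    for feature in features:
        bits[zlib.crc32(feature.encode("utf-8")) % n_bits] = 1
    return bits


def bag_of_words(text):
    """Order-free features: the set of words in the text."""
    return set(text.lower().split())


def words_and_bigrams(text):
    """Order-aware features (assumption): words plus adjacent word pairs."""
    tokens = text.lower().split()
    return set(tokens) | {a + "_" + b for a, b in zip(tokens, tokens[1:])}


class Discriminator:
    """One class: a row of RAM nodes, each remembering the addresses it saw."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        k = self.tuple_size
        for i in range(len(self.rams)):
            yield i, tuple(bits[j] for j in self.mapping[i * k:(i + 1) * k])

    def train(self, bits):
        for i, address in self._addresses(bits):
            self.rams[i].add(address)

    def response(self, bits):
        # Number of RAM nodes that recognise their part of the input.
        return sum(address in self.rams[i] for i, address in self._addresses(bits))


class WisardSketch:
    """One discriminator per class, all sharing the same random bit mapping."""

    def __init__(self, featurize, n_bits=256, tuple_size=4, seed=0):
        self.featurize = featurize
        self.n_bits = n_bits
        self.tuple_size = tuple_size
        self.mapping = list(range(n_bits))
        random.Random(seed).shuffle(self.mapping)
        self.discriminators = {}

    def train(self, text, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(self.mapping, self.tuple_size)
        self.discriminators[label].train(encode(self.featurize(text), self.n_bits))

    def scores(self, text):
        bits = encode(self.featurize(text), self.n_bits)
        return {label: d.response(bits) for label, d in self.discriminators.items()}


# Made-up statement of purpose and a hypothetical class label.
original = "retail sale of clothing"
shuffled = "clothing of sale retail"

order_free = WisardSketch(bag_of_words)
order_aware = WisardSketch(words_and_bigrams)
for model in (order_free, order_aware):
    model.train(original, "class_a")

print(order_free.scores(shuffled))
print(order_aware.scores(shuffled))
```

Under these assumptions the order-free model cannot tell the original sentence from its shuffled version, because both produce the same feature set and therefore the same input bits. The order-aware model sees different word pairs in the two sentences, so its response to the shuffled text may drop, depending on hash collisions and on the random mapping.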