2008 Eighth International Conference on Intelligent Systems Design and Applications
Text classification has mostly been performed through implicit semantic correlation techniques, such as latent semantic analysis. This approach, however, has proved insufficient when short texts must be classified into one or more of many classes. That is the case for the classification of statements of purpose of Brazilian companies according to the roughly 1,800 categories of CNAE-Subclasses, the government administration's detailed breakdown of the National Classification of Economic Activities (CNAE). The impact of word order in a text is evaluated by comparing the performance of three classifiers based on the weightless artificial neural model WISARD. Results point to the need to combine semantic with syntactic information in order to improve the classifiers' performance.
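To make the weightless model concrete, the following is a minimal sketch of a generic WISARD-style classifier in Python. It is not the configuration of the three classifiers compared in the paper. The vocabulary `VOCAB`, the bag-of-words `encode` function, the tuple size, and the class labels are all assumptions chosen for illustration. Each class has one discriminator made of RAM nodes; training stores the addresses seen for that class, and classification picks the class whose RAM nodes recognize the most addresses.

```python
import random

# Hypothetical vocabulary; a real system would derive it from the corpus.
VOCAB = ["retail", "sale", "of", "food",
         "software", "development", "services", "wholesale"]


def encode(text):
    """Binary bag-of-words vector: 1 if the vocabulary word occurs, else 0.
    This encoding deliberately ignores word order."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in VOCAB]


class Discriminator:
    """RAM nodes for a single class; each RAM is a set of seen addresses."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        n = self.tuple_size
        for i in range(len(self.rams)):
            positions = self.mapping[i * n:(i + 1) * n]
            yield i, tuple(bits[p] for p in positions)

    def train(self, bits):
        for i, address in self._addresses(bits):
            self.rams[i].add(address)

    def score(self, bits):
        # Number of RAM nodes that have seen the addressed pattern before.
        return sum(address in self.rams[i]
                   for i, address in self._addresses(bits))


class Wisard:
    """One discriminator per class, all sharing one random input mapping."""

    def __init__(self, input_size, tuple_size, seed=0):
        if input_size % tuple_size != 0:
            raise ValueError("input_size must be a multiple of tuple_size")
        self.tuple_size = tuple_size
        self.mapping = list(range(input_size))
        random.Random(seed).shuffle(self.mapping)
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(
                self.mapping, self.tuple_size)
        self.discriminators[label].train(bits)

    def classify(self, bits):
        return max(self.discriminators,
                   key=lambda c: self.discriminators[c].score(bits))


net = Wisard(len(VOCAB), tuple_size=2)
net.train(encode("retail sale of food"), "retail")        # made-up label
net.train(encode("software development services"), "software")
print(net.classify(encode("sale of food")))
```

Because `encode` discards word order, "sale of food" and "food of sale" produce the same input vector. An order-aware encoding would be needed for a classifier to exploit the kind of syntactic information discussed above.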