Tornado in Multilingual Opinion Analysis: A Transductive Learning Approach for Chinese Sentimental Polarity Recognition

Journal article by Yu-Chieh Wu, Li-Wei Yang, Jeng-Yan Shen, Liang-Yu Chen, Shih-Tung Wu
This paper was not found in any repository; the policy of its publisher is unknown or unclear.

Full text: Unavailable

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown


In this paper, we present our statistics-based opinion analysis system for this year's NTCIR-MOAT track. Our method involves two different approaches: (1) a machine-learning prototype system built on support vector machines (SVMs) and (2) stochastic estimation at the character level of words. The former is a direct application of state-of-the-art machine learning algorithms, while the latter comprises ad-hoc opinion word and phrase analysis. We submitted two runs to NTCIR-MOAT this year. The prototype system was first designed for Traditional Chinese; we also ported it directly to Simplified Chinese text via dictionary-based word translation. To make the model more robust, we apply the idea of transductive learning to our models. The main advantage of this approach is that it learns a hypothesis from labeled data while simultaneously adapting to large amounts of unlabeled data. Our method applies not only to SVM-based approaches but also to other, non-machine-learning algorithms. The experimental results showed that our method (approach 1) can effectively identify opinionated sentences at 0.661 and 0.611 F-measure under the lenient test. For polarity judgment, the two proposed approaches achieve F-measures of 0.284 and 0.294, respectively. In the relevant-sentence judgment track, our group achieved the best and the second-best results among all participants. Given the lack of labeled training data, we believe our method could be further enhanced by training on a larger and more consistent annotated corpus.
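The core transductive idea the abstract describes — learning a hypothesis from labeled data while adapting to unlabeled data — can be illustrated with a minimal self-training loop. The sketch below is an assumption-laden toy, not the authors' system: it replaces the SVM with a nearest-centroid rule over 1-D scores, and the `confidence` margin threshold is a hypothetical parameter chosen for illustration.

```python
# Minimal self-training sketch of transductive learning (illustrative only;
# the paper's actual system uses SVMs over Chinese opinion features).
# Classifier: nearest class centroid over toy 1-D sentiment scores.

def centroid(points):
    return sum(points) / len(points)

def self_train(pos, neg, unlabeled, confidence=0.5, rounds=3):
    """Iteratively absorb confidently classified unlabeled points
    into the labeled pools, then re-estimate the class centroids."""
    pos, neg, pool = list(pos), list(neg), list(unlabeled)
    for _ in range(rounds):
        cp, cn = centroid(pos), centroid(neg)
        remaining = []
        for x in pool:
            # Margin: how much closer x is to one centroid than the other.
            margin = abs(abs(x - cp) - abs(x - cn))
            if margin >= confidence:  # confident prediction: adopt the label
                (pos if abs(x - cp) < abs(x - cn) else neg).append(x)
            else:
                remaining.append(x)   # too ambiguous; revisit next round
        pool = remaining
    return centroid(pos), centroid(neg)

if __name__ == "__main__":
    cp, cn = self_train(pos=[2.0, 2.5], neg=[-2.0, -1.5],
                        unlabeled=[1.8, 2.2, -1.9, -2.4, 0.1])
    print(cp > 0 > cn)  # centroids stay on opposite sides of zero
```

The ambiguous point (0.1) is never absorbed, which mirrors the cautious behavior a transductive learner needs when unlabeled data is noisy.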