Patent SMT Based on Combined Phrases for NTCIR-7

Journal article published in 1 by Zhu Junguo, Qi Haoliang, Yang Muyun, Li Jufeng, Li Sheng
This paper was not found in any repository; the policy of its publisher is unknown or unclear.

Full text: Unavailable

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

In this paper, we describe a combined-phrase approach to the statistical machine translation (SMT) of Japanese patents into English. To resolve the segmentation errors caused by the many out-of-vocabulary (OOV) words in patent texts, character-based translation phrases are employed first. Word-based translation phrases are then established to exploit dependable word-level information. Finally, the two phrase tables are linearly combined to capture both character-level and word-level translation correspondences. Preliminary experiments on the NTCIR-7 corpus indicate that the BLEU scores of the proposed method significantly outperform those of the usual word-based approach.
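
The linear combination of the two phrase tables can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the interpolation weight, the feature set, or how character-level and word-level phrases are aligned to a common key, so the function name `combine_phrase_tables`, the weight `lam`, the dictionary layout, and the toy probabilities below are all assumptions made for illustration. A phrase pair missing from one table is assumed to contribute probability 0 from that table.

```python
def combine_phrase_tables(char_table, word_table, lam=0.5):
    """Linearly interpolate two phrase tables (hypothetical helper).

    Each table maps a (source_phrase, target_phrase) pair to a translation
    probability. `lam` weights the character-based table and (1 - lam) the
    word-based table. Both the layout and the weight are assumptions.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    combined = {}
    # Visit every phrase pair that occurs in at least one of the tables.
    for pair in set(char_table) | set(word_table):
        p_char = char_table.get(pair, 0.0)  # assumed 0 if the pair is absent
        p_word = word_table.get(pair, 0.0)
        combined[pair] = lam * p_char + (1.0 - lam) * p_word
    return combined


# Toy tables with made-up probabilities, for illustration only.
char_table = {("特許", "patent"): 0.6, ("発明", "invention"): 0.8}
word_table = {("特許", "patent"): 0.9, ("装置", "device"): 0.7}

combined = combine_phrase_tables(char_table, word_table, lam=0.4)
for (src, tgt), prob in sorted(combined.items()):
    print(f"{src} -> {tgt}: {prob:.2f}")
```

In this sketch a pair found in both tables receives a weighted average of the two scores, while a pair found in only one table is kept with a down-weighted score, so phrases learned at either the character level or the word level stay available to the decoder.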