Faculty Publications

Automatic Chinese character similarity measurement

Ming Liu, Southwest University
Vasile Rus, University of Memphis
Yue Li, Southwest University
Chuqian Sheng, Southwest University
Li Liu, Chongqing University

Abstract

Automatically identifying Chinese characters that are similar in glyph, pronunciation and meaning is important for building smart question generation tools in a computer-assisted language-learning environment. Previous research on Chinese character similarity measurement focused on the character glyph (e.g. structures, strokes and radicals), using heuristic algorithms whose parameters have preset values. This article presents a machine learning (regression) approach to measuring the similarity between two Chinese characters, based on information that includes not only the glyph but also the pronunciation (pinyin) and the semantic meaning derived from HowNet. We evaluated various regression models using a testing set consisting of 2586 pairs of characters selected from elementary Chinese textbooks. The results showed that four regression models (M5, Support Vector Machine, Gaussian Process and Linear Regression) performed similarly (0.617 ≤ Mean Absolute Error ≤ 0.641, 0.772 ≤ Root Mean Square Error ≤ 0.790). In addition, the study suggested that the performance of the regression model could be influenced by character frequency. Moreover, we evaluated the regression model on a well-known Chinese language learning resource, the 100 pairs of the most confusing Chinese characters. The experimental results indicated that this approach has potential for the recognition and generation of confusing Chinese character pairs.
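
The two error measures reported in the abstract can be illustrated with a minimal sketch. The scores below are hypothetical and are not taken from the article; the rating scale, the variable names and the helper functions are assumptions made only for illustration. The sketch shows how Mean Absolute Error and Root Mean Square Error compare predicted similarity scores against human-rated ones.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference; penalises large errors more."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical data: human similarity ratings for four character pairs
# and the scores a regression model might predict for the same pairs.
human_scores = [1.0, 3.0, 2.0, 5.0]
predicted_scores = [1.5, 2.0, 2.0, 4.5]

mae = mean_absolute_error(human_scores, predicted_scores)
rmse = root_mean_square_error(human_scores, predicted_scores)
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, which is consistent with the ranges quoted in the abstract.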

Publication Title

Web Intelligence

Recommended Citation

Liu, M., Rus, V., Li, Y., Sheng, C., & Liu, L. (2018). Automatic Chinese character similarity measurement. Web Intelligence, 16 (3), 195-202. https://doi.org/10.3233/WEB-180387
