Assessing student paraphrases using lexical semantics and word weighting


This paper presents an approach to assessing student paraphrases in the intelligent tutoring system iSTART. The approach measures the semantic similarity between a student paraphrase and a reference text, called the textbase. Semantic similarity is estimated using knowledge-based word relatedness measures, which rely on knowledge encoded in WordNet, a lexical database of English. We also experiment with weighting words by their importance. Word importance was derived from an analysis of word distributions in 2,225,726 Wikipedia documents. Performance is reported for 12 models that result from combining 3 relatedness measures, 2 word sense disambiguation methods, and 2 word-weighting schemes. Comparisons are also made to other approaches, such as Latent Semantic Analysis and the Entailer. © 2009 The authors and IOS Press. All rights reserved.
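The general idea of combining word-to-word relatedness with word weighting can be illustrated with a minimal sketch. This is not the scoring formula from the paper, which the abstract does not give. It assumes that each textbase word is matched to its most related paraphrase word and that importance weights take an IDF-like form computed from document frequencies. The names `idf_weights`, `weighted_similarity`, and `toy_relatedness` are hypothetical, and `toy_relatedness` is a hand-written stand-in for a WordNet-based measure.

```python
import math
from typing import Callable, Dict, List, Optional


def idf_weights(doc_freq: Dict[str, int], n_docs: int) -> Dict[str, float]:
    """Assumed IDF-style importance: words found in fewer documents weigh more."""
    return {w: math.log(n_docs / df) for w, df in doc_freq.items() if df > 0}


def weighted_similarity(
    paraphrase: List[str],
    textbase: List[str],
    relatedness: Callable[[str, str], float],
    weights: Optional[Dict[str, float]] = None,
    default_weight: float = 1.0,
) -> float:
    """Weighted average, over textbase words, of the best match in the paraphrase."""
    total = 0.0
    norm = 0.0
    for t in textbase:
        # Without a weight table every word counts equally.
        w = weights.get(t, default_weight) if weights is not None else 1.0
        best = max((relatedness(t, p) for p in paraphrase), default=0.0)
        total += w * best
        norm += w
    return total / norm if norm else 0.0


# Hypothetical stand-in for a knowledge-based relatedness measure.
_TOY_RELATED = {frozenset(("car", "automobile")): 0.9}


def toy_relatedness(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return _TOY_RELATED.get(frozenset((a, b)), 0.0)


textbase = ["the", "car", "stopped"]
paraphrase = ["the", "automobile", "halted"]

unweighted = weighted_similarity(paraphrase, textbase, toy_relatedness)
weighted = weighted_similarity(
    paraphrase,
    textbase,
    toy_relatedness,
    weights={"the": 0.1, "car": 2.0, "stopped": 2.0},  # made-up weights
)
print(unweighted, weighted)
```

In this toy case, down-weighting the function word "the" lowers the score, because the unmatched content word "stopped" then contributes a larger share of the total weight.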

Publication Title

Frontiers in Artificial Intelligence and Applications