A Sentence Similarity Method Based on Chunking and Information Content
Abstract
This paper introduces a method for assessing the semantic similarity between sentences. It relies on the assumption that the meaning of a sentence is captured by its syntactic constituents and the dependencies between them. We obtain both the constituents and their dependencies from a syntactic parser. Our algorithm treats two sentences as having the same meaning if it can find a good mapping between their chunks and if the chunk dependencies in one sentence are preserved in the other. Moreover, the algorithm takes into account that each chunk contributes differently to the overall meaning of a sentence; a chunk's importance is computed from the information content of its words. Experiments conducted on a well-known paraphrase data set show that the performance of our method is comparable to the state of the art. © 2014 Springer-Verlag Berlin Heidelberg.
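To make the idea of information-content-weighted chunk matching concrete, the following is a minimal sketch and not the authors' implementation. It assumes sentences are already split into chunks (lists of words), uses a hypothetical toy frequency table `TOY_COUNTS` in place of a real corpus, substitutes Jaccard word overlap for whatever chunk similarity the paper uses, and aligns chunks greedily. The dependency-preservation check described in the abstract is omitted. All function names are illustrative.

```python
import math

# Hypothetical unigram counts standing in for a real corpus (assumption).
TOY_COUNTS = {"the": 1000, "a": 800, "on": 500, "dog": 25,
              "cat": 20, "sat": 10, "mat": 5, "rug": 4}
TOTAL = sum(TOY_COUNTS.values())


def information_content(word):
    """IC(w) = -log p(w); words missing from the table get a count of 1."""
    count = TOY_COUNTS.get(word.lower(), 1)
    return -math.log(count / TOTAL)


def chunk_weight(chunk):
    """Importance of a chunk: sum of the information content of its words."""
    return sum(information_content(w) for w in chunk)


def chunk_similarity(c1, c2):
    """Placeholder chunk similarity: Jaccard overlap of lowercased words."""
    s1, s2 = {w.lower() for w in c1}, {w.lower() for w in c2}
    if not s1 or not s2:
        return 0.0
    return len(s1 & s2) / len(s1 | s2)


def sentence_similarity(chunks_a, chunks_b):
    """Greedy one-to-one chunk alignment, weighted by chunk importance."""
    candidates = sorted(
        ((chunk_similarity(a, b), i, j)
         for i, a in enumerate(chunks_a)
         for j, b in enumerate(chunks_b)),
        reverse=True,
    )
    used_a, used_b = set(), set()
    matched = 0.0
    for sim, i, j in candidates:
        if sim == 0.0:
            break  # remaining candidates share no words
        if i in used_a or j in used_b:
            continue  # each chunk is aligned at most once
        used_a.add(i)
        used_b.add(j)
        matched += sim * (chunk_weight(chunks_a[i]) + chunk_weight(chunks_b[j]))
    total = (sum(chunk_weight(c) for c in chunks_a)
             + sum(chunk_weight(c) for c in chunks_b))
    return matched / total if total else 0.0


if __name__ == "__main__":
    s1 = [["the", "cat"], ["sat"], ["on", "the", "mat"]]
    s2 = [["the", "cat"], ["sat"], ["on", "the", "rug"]]
    print(sentence_similarity(s1, s2))
```

In this sketch, rare words such as "mat" carry more weight than frequent ones such as "the", so a mismatch on a content word lowers the score more than a mismatch on a function word. A fuller version would also need to penalize alignments that break chunk dependencies.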
Publication Title
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Recommended Citation
Ştefǎnescu, D., Banjade, R., & Rus, V. (2014). A Sentence similarity method based on chunking and information content. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8403 LNCS (PART 1), 442-453. https://doi.org/10.1007/978-3-642-54906-9_36