Handling missing words by mapping across word vector representations


Vector-based word representation models are often developed from very large corpora. Even so, real-world applications frequently encounter words that are missing from the vocabulary of any single vector model. In this paper, we present a novel neural network (NN) based approach for obtaining representations of such words in a target model from another model, called the source model, in which those words do have representations, effectively pooling the two vocabularies. Our experiments show that the transformed vectors correlate well with the native target model representations, and that an extrinsic evaluation on a word-to-word similarity task using the SimLex-999 dataset yields results close to those obtained with native model representations.
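The mapping idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the network architecture, training procedure, or embedding models, so the one-hidden-layer regression network, the hyperparameters, and the synthetic "source" and "target" vectors below are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two embedding models (synthetic data, NOT from the paper).
vocab = [f"w{i}" for i in range(200)]
d_src, d_tgt, d_hidden = 8, 6, 16
source = {w: rng.normal(size=d_src) for w in vocab}

# Assumption: the target space is a fixed nonlinear function of the source
# space, so that a mapping exists for the network to learn.
A = rng.normal(size=(d_src, d_tgt)) / np.sqrt(d_src)
true_target = {w: np.tanh(source[w] @ A) for w in vocab}
# The target model only "knows" the first 150 words; the rest are missing.
target = {w: true_target[w] for w in vocab[:150]}

# Train on the words both models share.
shared = [w for w in vocab if w in target]
X = np.stack([source[w] for w in shared])
Y = np.stack([target[w] for w in shared])

# One-hidden-layer regression network (hypothetical architecture).
W1 = rng.normal(size=(d_src, d_hidden)) * 0.1
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(d_hidden, d_tgt)) * 0.1
b2 = np.zeros(d_tgt)

def forward(x):
    """Return hidden activations and predicted target-space vectors."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.05
losses = []
for _ in range(2000):
    H, P = forward(X)
    err = P - Y
    # Loss: mean over words of the squared Euclidean error.
    losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
    # Backpropagation for that loss (full-batch gradient descent).
    gP = 2.0 * err / len(X)
    gW2, gb2 = H.T @ gP, gP.sum(axis=0)
    gZ = (gP @ W2.T) * (1.0 - H ** 2)
    gW1, gb1 = X.T @ gZ, gZ.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

# Map the words that the target model lacks into the target space.
missing = [w for w in vocab if w not in target]
_, mapped = forward(np.stack([source[w] for w in missing]))

# Because the data is synthetic, the "native" vectors of the missing words
# are known here, so the transformed vectors can be correlated with them.
native = np.stack([true_target[w] for w in missing])
corr = float(np.corrcoef(mapped.ravel(), native.ravel())[0, 1])
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, held-out correlation: {corr:.3f}")
```

With real models, `source` and `target` would be replaced by two pretrained embedding tables, and the held-out correlation would be computed on shared words withheld from training, since native target vectors do not exist for words that are truly missing.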

Publication Title

Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
