Handling missing words by mapping across word vector representations
Vector-based word representation models are often developed from very large corpora. However, real-world applications frequently encounter words that are not covered by any single vector model. In this paper, we present a novel Neural Network (NN) based approach for obtaining representations of words in a target model from another model, called the source model, in which representations for those words are available, effectively pooling the two vocabularies. Our experiments show that the transformed vectors correlate well with the native target model representations, and that an extrinsic evaluation on a word-to-word similarity task using the SimLex-999 dataset yields results close to those obtained with native model representations.
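The mapping idea described in the abstract can be sketched as follows: train a small network on words shared by both models (source vector in, target vector out), then apply it to words present only in the source model. This is a minimal illustrative sketch, not the authors' implementation; the synthetic vocabularies, dimensions, architecture, and training details are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two small synthetic "word vector models" with an
# overlapping vocabulary; "oov_a"/"oov_b" exist only in the source model.
shared = [f"w{i}" for i in range(200)]
missing = ["oov_a", "oov_b"]
d_src, d_tgt, d_hid = 20, 15, 32

source = {w: rng.standard_normal(d_src) for w in shared + missing}
# Make the target model loosely related to the source so a mapping exists.
W_true = rng.standard_normal((d_src, d_tgt)) * 0.3
target = {w: np.tanh(source[w] @ W_true) + 0.01 * rng.standard_normal(d_tgt)
          for w in shared}

X = np.stack([source[w] for w in shared])   # inputs: source-model vectors
Y = np.stack([target[w] for w in shared])   # outputs: target-model vectors

# One-hidden-layer network trained by plain gradient descent on MSE.
W1 = rng.standard_normal((d_src, d_hid)) * 0.1
b1 = np.zeros(d_hid)
W2 = rng.standard_normal((d_hid, d_tgt)) * 0.1
b2 = np.zeros(d_tgt)

def mse():
    H = np.tanh(X @ W1 + b1)
    return float(((H @ W2 + b2 - Y) ** 2).mean())

loss_before = mse()
lr = 0.05
for _ in range(2000):
    H = np.tanh(X @ W1 + b1)        # hidden activations
    P = H @ W2 + b2                 # predicted target vectors
    G = 2 * (P - Y) / len(X)        # gradient of MSE w.r.t. predictions
    gW2, gb2 = H.T @ G, G.sum(0)
    GH = (G @ W2.T) * (1 - H ** 2)  # backprop through tanh
    gW1, gb1 = X.T @ GH, GH.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
loss_after = mse()

def transform(w):
    """Map a word's source-model vector into the target model's space."""
    return np.tanh(source[w] @ W1 + b1) @ W2 + b2

# Representations for words missing from the target model, now usable
# alongside the target model's native vectors.
oov_vectors = {w: transform(w) for w in missing}
```

Once trained, `transform` supplies a target-space vector for any word the source model covers, which is the sense in which the two vocabularies are pooled.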
Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
Banjade, R., Maharjan, N., Gautam, D., & Rus, V. (2016). Handling missing words by mapping across word vector representations. Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016, 250-253. Retrieved from https://digitalcommons.memphis.edu/facpubs/2861