Can word probabilities from LDA be simply added up to represent documents?


This paper proposes an alternative document representation: the topic probabilities from LDA are treated as a vector representation for each word, and a document is represented as a combination of its word vectors. A comparison on summary data shows that this representation is more effective for document classification.
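
The idea can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes a topic-word matrix from an already-fitted LDA model, takes each word's vector to be its normalized column of that matrix (equivalent to P(topic | word) under a uniform topic prior, which is an assumption made here), and uses a plain sum as the "combination". The function names `word_topic_vectors` and `document_vector` and the toy numbers are hypothetical.

```python
import numpy as np


def word_topic_vectors(topic_word):
    """Turn a (K topics x V words) matrix of P(word | topic) into one K-dim vector per word.

    Each column is rescaled to sum to 1. Under a uniform topic prior (an
    assumption of this sketch) that column is P(topic | word).
    """
    topic_word = np.asarray(topic_word, dtype=float)
    col_sums = topic_word.sum(axis=0, keepdims=True)
    return (topic_word / col_sums).T  # shape (V, K)


def document_vector(token_ids, word_vectors, normalize=True):
    """Add up the word vectors of every token in a document."""
    vec = word_vectors[np.asarray(token_ids, dtype=int)].sum(axis=0)
    if normalize and vec.sum() > 0:
        vec = vec / vec.sum()  # rescale so the entries sum to 1
    return vec


# Toy example with made-up numbers: 2 topics, 4-word vocabulary.
phi = np.array([
    [0.4, 0.4, 0.1, 0.1],  # topic 0
    [0.1, 0.1, 0.4, 0.4],  # topic 1
])
vectors = word_topic_vectors(phi)

doc = [0, 1, 1, 2]  # token ids: three topic-0 words, one topic-1 word
doc_vec = document_vector(doc, vectors)
print(doc_vec)
```

The resulting vector has one entry per topic and could be passed to any standard classifier as the document's feature vector. Whether to normalize, or to weight words before summing, is a design choice this sketch leaves open.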

Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016

