Application of LSA space's dimension character in document multi-hierarchy clustering


In LSA space, dimensions corresponding to larger singular values reflect the general concepts of language elements, while dimensions corresponding to smaller singular values reflect their particular concepts. On this basis, different dimensions of the LSA space are used to cluster documents at various concept granularities. In addition, in the LSA-based document clustering algorithm, better clustering results are obtained by taking the row vectors of the document self-indexing matrix as the objects to be clustered, instead of the low-dimensional document vectors. © 2005 IEEE.
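
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the "document self-indexing matrix" is read here as the document-by-document matrix V Σ² Vᵀ restricted to the chosen dimensions; k-means with a deterministic farthest-point start stands in for whatever clustering method the authors used; and the toy term-by-document matrix and all function names (`lsa_doc_factors`, `doc_vectors`, `doc_self_index`, `kmeans`) are hypothetical.

```python
import numpy as np

def lsa_doc_factors(term_doc):
    """SVD of a term-by-document matrix A = U S V^T; returns S and V^T."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return s, vt

def doc_vectors(s, vt, dims):
    """Document vectors restricted to the selected LSA dimensions."""
    dims = list(dims)
    return (s[dims, None] * vt[dims, :]).T  # shape: (n_docs, len(dims))

def doc_self_index(s, vt, dims):
    """ASSUMPTION: 'self-indexing matrix' taken as V S^2 V^T on the chosen
    dimensions, i.e. inner products between the document vectors."""
    d = doc_vectors(s, vt, dims)
    return d @ d.T  # shape: (n_docs, n_docs)

def kmeans(rows, n_clusters, n_iter=20):
    """Plain k-means with a farthest-point start (a stand-in clusterer)."""
    rows = np.asarray(rows, dtype=float)
    centers = [rows[0]]
    while len(centers) < n_clusters:
        # Squared distance from every row to its nearest chosen center.
        d2 = np.min([((rows - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(rows[int(d2.argmax())])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((rows[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = rows[labels == c].mean(axis=0)
    return labels

# Hypothetical toy data: rows are terms, columns are documents.
# Documents 0-2 share one vocabulary, documents 3-5 another.
A = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 0, 1, 2],
], dtype=float)

s, vt = lsa_doc_factors(A)

# Coarse granularity: dimensions with the largest singular values.
coarse_dims = [0, 1]
labels_from_vectors = kmeans(doc_vectors(s, vt, coarse_dims), n_clusters=2)
labels_from_self_index = kmeans(doc_self_index(s, vt, coarse_dims), n_clusters=2)

# Finer granularity: dimensions with smaller singular values.
fine_dims = [2, 3]
fine_labels = kmeans(doc_self_index(s, vt, fine_dims), n_clusters=2)

print(labels_from_vectors, labels_from_self_index, fine_labels)
```

In this sketch the concept granularity is controlled only by the `dims` argument, and the two coarse runs differ only in whether the low-dimensional document vectors or the rows of the assumed self-indexing matrix are passed to the clusterer. The toy data is too small to show the quality difference reported in the abstract.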

Publication Title

2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
