Computerized summary scoring: crowdsourcing-based latent semantic analysis


In this study we developed and evaluated a crowdsourcing-based latent semantic analysis (LSA) approach to computerized summary scoring (CSS). LSA is a frequently used mathematical component in CSS, where LSA similarity represents the extent to which the to-be-graded target summary is similar to a model summary or a set of exemplar summaries. Researchers have proposed different formulations of the model summary in previous studies, such as pregraded summaries, expert-generated summaries, or source texts. The first two methods, however, require substantial human time, effort, and cost to grade or generate the summaries. Using source texts requires no human effort, but it does not predict human summary scores well. With human summary scores as the gold standard, in this study we evaluated the crowdsourcing LSA method by comparing it with seven other LSA methods that used sets of summaries from different sources (either experts or crowdsourced) of differing quality, along with source texts. Results showed that crowdsourcing LSA predicted human summary scores as well as expert-good and crowdsourcing-good summaries, and better than the other methods. A series of analyses with different numbers of crowdsourcing summaries demonstrated that the number (from 10 to 100) did not significantly affect performance. These findings imply that crowdsourcing LSA is a promising approach to CSS, because it saves human effort in generating the model summary while still yielding comparable performance. This approach to small-scale CSS provides a practical solution for instructors in courses, and also advances research on automated assessments in which student responses are expected to semantically converge on subject matter content.
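The idea of scoring a target summary by its LSA similarity to a set of crowdsourced summaries can be illustrated with a minimal sketch. It is not the study's implementation: the function names (`count_vector`, `lsa_score`), the raw term counts, the whitespace tokenizer, the choice of `k`, and the use of the mean cosine as the score are all assumptions made for illustration. In particular, the semantic space here is built from the crowdsourced summaries alone, whereas LSA spaces are commonly trained on a much larger corpus.

```python
import numpy as np

def count_vector(text, index):
    """Raw term counts for `text` over a fixed vocabulary (unknown words are ignored)."""
    v = np.zeros(len(index))
    for w in text.lower().split():
        if w in index:
            v[index[w]] += 1.0
    return v

def lsa_score(target, crowd_summaries, k=2):
    """Mean cosine similarity, in a k-dimensional LSA space, between the
    target summary and each crowdsourced summary (hypothetical scoring rule)."""
    # Vocabulary and term-by-document matrix from the crowdsourced summaries only.
    vocab = sorted({w for s in crowd_summaries for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.column_stack([count_vector(s, index) for s in crowd_summaries])

    # Truncated SVD: keep the k largest singular directions.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk = U[:, :min(k, len(s))]

    # Project crowd summaries and fold the target into the same space.
    crowd_vecs = (Uk.T @ X).T          # one row per crowdsourced summary
    target_vec = Uk.T @ count_vector(target, index)

    t_norm = np.linalg.norm(target_vec)
    if t_norm == 0.0:                  # target shares no vocabulary with the crowd
        return 0.0
    c_norms = np.linalg.norm(crowd_vecs, axis=1)
    c_norms[c_norms == 0.0] = 1.0      # avoid division by zero
    cosines = (crowd_vecs @ target_vec) / (c_norms * t_norm)
    return float(cosines.mean())

crowd = [
    "the heart pumps blood through the body",
    "blood is pumped by the heart to the body",
    "the heart moves blood around the body",
]
on_topic = lsa_score("the heart pumps blood to the body", crowd)
off_topic = lsa_score("stock markets fell sharply today", crowd)
```

In this toy example the on-topic summary scores higher than the off-topic one, which shares no vocabulary with the crowdsourced set and so receives a score of zero.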

Publication Title

Behavior Research Methods