SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-explanation during Source Code Comprehension
Abstract
The ability to automatically assess learners' activities is key to user modeling and personalization in adaptive educational systems. The work presented in this paper expands the scope of automated assessment from traditional programming problems to code comprehension tasks in which students are asked to explain the critical steps of a program. Automatically assessing these self-explanations offers a unique opportunity to understand the current state of student knowledge, recognize possible misconceptions, and provide feedback. Annotated datasets are needed to train Artificial Intelligence/Machine Learning approaches for the automated assessment of student explanations. To address this need, we present a novel corpus called SelfCode, which consists of 1,770 sentence pairs of student and expert self-explanations of Java code examples, along with semantic similarity judgments provided by experts. We also present a baseline automated assessment model that relies on textual features. The corpus is available in a GitHub repository.
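As a rough illustration of the kind of textual-feature baseline the abstract describes, the sketch below scores a student self-explanation against an expert reference with bag-of-words cosine similarity. This is a minimal, assumed example, not the paper's actual model; the example sentences and the `cosine_similarity` helper are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over lowercase whitespace-token counts.

    Illustrative only: real textual-feature models typically add
    TF-IDF weighting, lemmatization, or embedding-based features.
    """
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

# Hypothetical student/expert explanation pair for a Java loop
student = "the loop adds each element of the array to the total"
expert = "the for loop accumulates the sum of all array elements"
score = cosine_similarity(student, expert)
```

A learned model would map such similarity features (among others) to the expert semantic similarity judgments collected in the corpus.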
Publication Title
Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS
Recommended Citation
Chapagain, J., Risha, Z., Banjade, R., Oli, P., Tamang, L., Brusilovsky, P., & Rus, V. (2023). SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-explanation during Source Code Comprehension. Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS, 36. https://doi.org/10.32473/flairs.36.133385