Does Language Model Surprisal Measure Code Comprehension?
- Casey Casalnuovo, Department of Computer Science, University of California, Davis, Davis, California, United States
- Premkumar Devanbu, Computer Science, UC Davis, Davis, California, United States
- Emily Morgan, Linguistics, UC Davis, Davis, California, United States
Abstract
Recognition of the similarities between programming and natural languages has led to a boom in the adoption of language modeling techniques in tools that assist developers. However, language model surprisal, which guides the training and evaluation in many of these methods, has not been validated as a measure of cognitive difficulty for programming language comprehension as it has been for natural language. We perform a controlled experiment to evaluate human comprehension of fragments of source code that are meaning-equivalent but differ in surprisal. We find that more surprising versions of code take humans longer to finish answering correctly. We also provide practical guidelines for designing future studies of code comprehension and surprisal.