Neural Language Models Capture Some, But Not All Agreement Attraction Effects

Suhas Arehalli, Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States
Tal Linzen, Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States

Abstract

The number of the subject in English must match the number of the corresponding verb (a dog runs but dogs run). Yet in real-time language production and comprehension, speakers often mistakenly compute agreement between the verb and a grammatically irrelevant non-subject noun phrase instead. This phenomenon, referred to as agreement attraction, is modulated by a wide range of factors; any complete computational model of grammatical planning and comprehension would be expected to derive this rich empirical picture. Recent developments in Natural Language Processing have shown that neural networks trained only on word prediction over large corpora are capable of capturing subject-verb agreement dependencies to a significant extent, but with occasional errors. In this paper, we evaluate the potential of such neural word prediction models as a foundation for a cognitive model of real-time grammatical processing. We use LSTMs, a common class of sequence prediction models used for language modeling, to simulate six experiments taken from the agreement attraction literature. The LSTMs captured the critical human behavior in three out of the six experiments, indicating that (1) some agreement attraction phenomena can be captured by a generic sequence processing model, but (2) capturing the other phenomena may require models with more language-specific mechanisms.
