Which sentence embeddings and which layers encode syntactic structure?
- M. Kelly, College of Information Sciences and Technology, The Pennsylvania State University, University Park, Pennsylvania, United States
- Yang Xu, Department of Computer Science, San Diego State University, San Diego, California, United States
- Jesus Calvillo, College of Information Sciences and Technology, The Pennsylvania State University, University Park, Pennsylvania, United States
- David Reitter, Google Research, New York City, New York, United States
Abstract: Recent models of language have eliminated syntactic-semantic dividing lines. We explore the psycholinguistic implications of this development by comparing different types of sentence embeddings in their ability to encode syntactic constructions. Our study uses contrasting sentence structures known to cause syntactic priming effects, that is, the tendency in humans to repeat sentence structures after recent exposure. We compare how syntactic alternatives are captured by sentence embeddings produced by a neural language model (BERT) or by the composition of word embeddings (BEAGLE, HHM, GloVe). Dative double object vs. prepositional object and active vs. passive sentences are separable in the high-dimensional space of the sentence embeddings and can be classified with a high degree of accuracy. The results lend empirical support to modern, computational, integrated accounts of semantics and syntax, and they shed light on the information stored at different layers in deep language models such as BERT.
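The abstract describes probing sentence embeddings taken from individual BERT layers for syntactic structure. Below is a minimal sketch, not the authors' code, of that kind of probe: mean-pool one BERT layer into a sentence vector and train a simple classifier to separate active from passive sentences. The example sentences, the layer index, and the choice of logistic regression are illustrative assumptions.

```python
import numpy as np
import torch
from transformers import BertModel, BertTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def sentence_embedding(sentence: str, layer: int = 8) -> np.ndarray:
    """Mean-pool the hidden states of one BERT layer into a sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    hidden = outputs.hidden_states[layer]          # shape: (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()   # shape: (768,)

# Hypothetical stimuli contrasting the two structural alternatives.
active = ["The chef praised the waiter.", "The dog chased the cat."]
passive = ["The waiter was praised by the chef.", "The cat was chased by the dog."]

X = np.stack([sentence_embedding(s) for s in active + passive])
y = np.array([0] * len(active) + [1] * len(passive))

# Cross-validated classification accuracy indicates how separable the two
# constructions are in the embedding space of the chosen layer.
clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, cv=2).mean())
```

Repeating the same probe over each layer's hidden states (and over composed word-embedding baselines such as GloVe averages) would give the layer-by-layer comparison the abstract refers to.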