Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models.

Abstract

Distributional semantic models (DSMs) vary substantially in the types of semantic similarity they output. Despite this variation, the different types of similarity are often conflated into a monolithic concept in models of behavioural data. We apply the insight that word2vec's representations can capture both paradigmatic similarity (substitutability) and syntagmatic similarity (co-occurrence) to two sets of experimental findings (semantic priming and the effect of semantic neighbourhood density) that have previously been modeled with monolithic conceptions of DSM-based semantic similarity. Using paradigmatic and syntagmatic similarity based on word2vec, we show that for some tasks and types of items the two kinds of similarity play complementary explanatory roles, whereas for others only syntagmatic similarity seems to matter. These findings remind us that it is important to develop more precise accounts of what we believe our DSMs represent, and they provide novel perspectives on established behavioural patterns.
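To make the distinction concrete, the following is a minimal sketch (not the authors' code) of how word2vec's two weight matrices can yield the two kinds of similarity. It assumes the input ("word") and output ("context") embedding matrices are available as numpy arrays W_in and W_out, and that `vocab` maps word strings to row indices; all names are illustrative.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def paradigmatic_similarity(w1, w2, W_in, vocab):
    # Substitutability: compare the two words' *input* embeddings,
    # i.e. how similar the contexts they tend to predict are.
    return cosine(W_in[vocab[w1]], W_in[vocab[w2]])

def syntagmatic_similarity(w1, w2, W_in, W_out, vocab):
    # Co-occurrence: compare w1's input embedding with w2's *output*
    # embedding, i.e. how strongly w1 predicts w2 as a nearby word.
    return cosine(W_in[vocab[w1]], W_out[vocab[w2]])
```

Under this reading, paradigmatic similarity is the usual word-to-word cosine, whereas syntagmatic similarity crosses the two embedding spaces; how either score is then related to priming or neighbourhood-density effects is a matter of the paper's analyses, not of this sketch.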

