Integrating Semantics Into Developmental Models of Morphology Learning

Abstract

A key challenge in language acquisition is learning morphological transforms relating word roots to derived forms. Traditional unsupervised algorithms find morphological patterns in sequences of phonemes, but struggle to distinguish valid segmentations from spurious ones because they ignore meaning. For example, a system that correctly discovers "add /z/" as a valid morphological transform (song-songs, year-years) might incorrectly infer that "add /ah.t/" is also valid (mark-market, spear-spirit). We propose that learners could avoid these errors with a simple semantic assumption: morphological transforms approximately preserve meaning. We extend an algorithm from Chan and Yang (2008) by integrating proximity in vector-space word embeddings as a criterion for valid transforms. On a corpus of child-directed speech, we achieve both higher accuracy and broader coverage than the purely phonemic approach, even in more developmentally plausible learning paradigms. Finally, we consider a deeper semantic assumption that could guide the acquisition of more abstract, human-like morphological understanding.
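To make the semantic criterion concrete, the following is a minimal sketch of the core idea described in the abstract: a candidate transform is accepted only if the word pairs it relates are close in embedding space. The function names, the use of cosine similarity, and the threshold value are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def transform_is_valid(pairs, embed, threshold=0.4):
    """Accept a candidate morphological transform only if the word pairs
    it relates are, on average, close in embedding space.

    pairs     -- list of (root, derived) word pairs, e.g. [("song", "songs")]
    embed     -- dict mapping words to embedding vectors
    threshold -- illustrative cutoff; a real learner would tune this
    """
    sims = [cosine(embed[root], embed[derived])
            for root, derived in pairs
            if root in embed and derived in embed]
    return bool(sims) and np.mean(sims) >= threshold

# Toy example with hand-crafted vectors, arranged so that the inflected
# pair (song, songs) is close and the spurious pair (mark, market) is not.
embed = {
    "song": np.array([1.0, 0.1]), "songs":  np.array([0.9, 0.2]),
    "mark": np.array([0.1, 1.0]), "market": np.array([1.0, -0.5]),
}
print(transform_is_valid([("song", "songs")], embed))   # True
print(transform_is_valid([("mark", "market")], embed))  # False
```

Under this sketch, "add /z/" survives the semantic filter because its supporting pairs (song-songs, year-years) are near-synonyms in embedding space, while "add /ah.t/" is rejected because pairs like mark-market are semantically unrelated despite their phonemic regularity.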

