Tracing the Emergence of Gendered Language in Childhood

Abstract

Are gender associations in general language reflected in the words spoken to and by children? Previous work has suggested that language reveals gender differences in discourse, speech style, language use and acquisition. Work in artificial intelligence has shown that word embeddings trained on large corpora reflect human gender associations. We connect this work to developmental psychology by exploring whether gender associations in word embeddings are present in the linguistic input and output of children, and if so, how early gendered language emerges. We present a computational method that quantifies the gender associations of words and use a corpus of child-caretaker speech to show that these gender associations correlate significantly with those in word embeddings. We discover that gendered word use emerges in English-speaking children around age 2, and the gender associations cannot be explained solely by variables including word length, frequency, concreteness, and valence.

