Tracing the Emergence of Gendered Language in Childhood
- Ben Prystawski, Department of Computer Science, Cognitive Science Program, University of Toronto, Toronto, Ontario, Canada
- Erin Grant, EECS Department, U.C. Berkeley, Berkeley, California, United States
- Aida Nematzadeh, DeepMind, London, United Kingdom
- Spike W. S. Lee, Rotman School of Management and Department of Psychology, University of Toronto, Toronto, Ontario, Canada
- Suzanne Stevenson, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Yang Xu, University of Toronto, Toronto, Ontario, Canada
Abstract
Are gender associations in general language reflected in the words spoken to and by children? Previous work has suggested that language reveals gender differences in discourse, speech style, language use, and acquisition. Work in artificial intelligence has shown that word embeddings trained on large corpora reflect human gender associations. We connect this work to developmental psychology by exploring whether gender associations in word embeddings are present in the linguistic input and output of children, and if so, how early gendered language emerges. We present a computational method that quantifies the gender associations of words, and we use a corpus of child-caretaker speech to show that these gender associations correlate significantly with those in word embeddings. We find that gendered word use emerges in English-speaking children around age 2, and that the gender associations cannot be explained solely by variables such as word length, frequency, concreteness, and valence.
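To make the comparison described in the abstract concrete, the following is a minimal sketch of one way a corpus-based gender association could be compared with an embedding-based one. It is not necessarily the method used in this work: the smoothed log-odds score, the anchor-vector difference of cosine similarities, and all names and toy data (`corpus_gender_score`, `embedding_gender_score`, `female_counts`, `toy_vectors`, and so on) are assumptions introduced purely for illustration.

```python
import math
from collections import Counter


def corpus_gender_score(word, female_counts, male_counts, alpha=1.0):
    """Smoothed log-odds of `word` in female-tagged vs. male-tagged speech.

    Positive values mean the word is relatively more frequent in the
    female-tagged counts. This scoring rule is an illustrative assumption.
    """
    vocab = set(female_counts) | set(male_counts)
    f_total = sum(female_counts.values()) + alpha * len(vocab)
    m_total = sum(male_counts.values()) + alpha * len(vocab)
    p_f = (female_counts.get(word, 0) + alpha) / f_total
    p_m = (male_counts.get(word, 0) + alpha) / m_total
    return math.log(p_f / p_m)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def embedding_gender_score(vec, female_anchors, male_anchors):
    """Mean cosine to female anchor vectors minus mean cosine to male ones."""
    f = sum(cosine(vec, a) for a in female_anchors) / len(female_anchors)
    m = sum(cosine(vec, a) for a in male_anchors) / len(male_anchors)
    return f - m


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: word counts in speech tagged by child gender,
# and made-up 2-d "embeddings" with one anchor vector per gender.
female_counts = Counter({"doll": 8, "truck": 2, "ball": 5})
male_counts = Counter({"doll": 2, "truck": 8, "ball": 5})
toy_vectors = {"doll": [0.9, 0.1], "truck": [0.1, 0.9], "ball": [0.5, 0.5]}
female_anchors = [[1.0, 0.0]]
male_anchors = [[0.0, 1.0]]

words = ["doll", "truck", "ball"]
corpus_scores = [corpus_gender_score(w, female_counts, male_counts) for w in words]
embed_scores = [
    embedding_gender_score(toy_vectors[w], female_anchors, male_anchors)
    for w in words
]
r = pearson(corpus_scores, embed_scores)
```

In a real analysis the counts would come from transcribed child-caretaker speech and the vectors from pretrained embeddings, and the correlation would be computed over many words and accompanied by a significance test.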