Detecting social information in a dense database of infants’ natural visual experience

Abstract

The faces and hands of caregivers and other social partners offer a rich source of social and causal information that may be critical for infants' cognitive and linguistic development. Previous work using manual annotation strategies and cross-sectional data has found systematic changes in the proportion of faces and hands in the egocentric perspective of young infants. Here, we examine the prevalence of faces and hands in a longitudinal collection of nearly 1700 headcam videos collected from three children across a span of 6 to 32 months of age: the SAYCam dataset (Sullivan, Mei, Perfors, Wojcik, & Frank, under review). To analyze these naturalistic infant egocentric videos, we first validated the use of a modern convolutional neural network for pose detection (OpenPose) to detect faces and hands. We then applied this model to the entire dataset and found a higher proportion of hands in view than previously reported, as well as a moderate decrease in the proportion of faces in children's view across age. In addition, we found variability in the proportion of faces and hands viewed by different children in different locations (e.g., living room vs. kitchen), suggesting that individual activity contexts may shape the social information that infants experience.
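As a rough illustration of how such an analysis might be structured, the Python sketch below samples frames from a headcam video and tallies the proportion of sampled frames containing a detected face or hand. This is a minimal sketch under stated assumptions, not the paper's pipeline: the `detector` callable, the `face_hand_proportions` function name, and the `FRAME_STRIDE` sampling rate are all illustrative assumptions, and the actual OpenPose API is not reproduced here.

```python
import cv2  # OpenCV (assumed available) for reading video frames

# Assumed sampling rate: roughly one frame per second of 30 fps video.
FRAME_STRIDE = 30

def face_hand_proportions(video_path, detector, stride=FRAME_STRIDE):
    """Estimate the proportion of sampled frames containing a face / a hand.

    `detector` is a hypothetical callable, frame -> (face_present, hand_present);
    in the paper this detection step is performed by OpenPose's face and hand
    keypoint detectors, which would fill this role in a real implementation.
    """
    cap = cv2.VideoCapture(video_path)
    n_sampled = n_face = n_hand = 0
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video
            break
        if frame_idx % stride == 0:
            face_present, hand_present = detector(frame)
            n_sampled += 1
            n_face += bool(face_present)
            n_hand += bool(hand_present)
        frame_idx += 1
    cap.release()
    if n_sampled == 0:
        return 0.0, 0.0
    return n_face / n_sampled, n_hand / n_sampled
```

Applied per video and then aggregated by child age or by recording location, proportions of this kind would support the developmental and contextual comparisons described in the abstract.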

