Does bilingual input hurt? A simulation of language discrimination and clustering using i-vectors

Maureen de Seyssel, CoML, Laboratoire de Sciences Cognitives et Psycholinguistiques, ENS-PSL/CNRS/EHESS/INRIA, Paris, France
Emmanuel Dupoux, CoML, Laboratoire de Sciences Cognitives et Psycholinguistiques, ENS-PSL/CNRS/EHESS/INRIA, Paris, France

Abstract

The language discrimination process in infants has been successfully modeled using i-vector based systems, with results replicating several experimental findings. Still, recent work found intriguing results regarding the difference between monolingual and mixed-language exposure on language discrimination tasks. We apply the i-vector model of language discrimination to two carefully designed datasets, adding a "bilingual" condition. Our results do not show any difference in the ability to discriminate languages between the three backgrounds (monolingual, mixed, and bilingual), although we do replicate past observations that distant languages (English-Finnish) are easier to discriminate than close languages (English-German). We do, however, find a strong effect of background when testing the learner's ability to automatically sort sentences into language clusters: a bilingual background is generally harder than a mixed background (one speaker, one language). Other analyses reveal that clustering is dominated by speaker information rather than by language.
