Why do infants learn language so fast? A reverse engineering approach
This project develops a computational model to explore how infants learn language so efficiently through statistical learning plus three additional mechanisms, aiming to produce outcomes comparable to children's language acquisition.
Project details
Introduction
How do infants learn their first language(s)? The popular yet controversial 'statistical learning hypothesis' posits that they learn by gradually collecting statistics over their language input. This is strikingly similar to how current AI Large Language Models (LLMs) learn, suggesting that simple statistical mechanisms may be sufficient to attain adult-like language competence.
Learning Data Comparison
But does it? Estimates of language input to children show that by age 3, they have received two to three orders of magnitude less data than LLMs of similar performance. The gap grows exponentially larger with children's age. Worse, when models are fed speech instead of text, they learn even more slowly. How do infants learn so efficiently?
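The scale of this gap can be illustrated with a back-of-envelope calculation. The figures below are assumptions for illustration only (plausible published estimates put a 3-year-old's cumulative input in the tens of millions of words, while LLMs of comparable benchmark performance train on billions of tokens); they are not data from this project:

```python
import math

# Assumed, illustrative figures -- not project data:
child_words = 3e7   # ~30 million words heard by age 3 (assumed estimate)
llm_tokens = 3e10   # ~30 billion training tokens for a comparable LLM (assumed)

# The gap in orders of magnitude is the base-10 log of the ratio.
gap_orders = math.log10(llm_tokens / child_words)
print(f"data gap ≈ {gap_orders:.1f} orders of magnitude")  # prints "data gap ≈ 3.0 orders of magnitude"
```

Under these assumed figures the gap lands at the upper end of the two-to-three-orders range cited above; larger LLM training sets push it further still.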
Hypothesis Testing
This project tests the hypothesis that in addition to statistical learning, infants benefit from three mechanisms that accelerate their learning rate:
- They are born with a vocal tract which helps them understand the link between abstract motor commands and speech sounds, and decode noisy speech inputs more efficiently.
- They have an episodic memory enabling them to learn from unique events, instead of gradually learning from thousands of repetitions.
- They start with an evolved learning architecture optimized for generalization from few and noisy inputs.
Methodology
Our approach is to build a computational model of the learner (an infant simulator) which, when fed realistic language input, produces outcome measures comparable to children's (laboratory experiments, vocabulary estimates). This yields a quantitative estimate of the efficiency gain from each of the three mechanisms, as well as new testable predictions.
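The comparison logic described above can be sketched as follows. This is a hypothetical, minimal sketch, not project code: every benchmark name, score, and the `efficiency` helper are invented for illustration. The idea is simply that a model variant (e.g. with episodic memory enabled) is scored on the same outcome measures as children, and the ratio quantifies how child-like its learning is:

```python
from dataclasses import dataclass

@dataclass
class Benchmark:
    """One child-derived outcome measure (names and values are made up)."""
    name: str
    child_score: float

def efficiency(model_score: float, child_score: float) -> float:
    """Ratio of model outcome to child outcome (1.0 = child-like)."""
    return model_score / child_score

# Illustrative benchmarks: a lab-experiment accuracy and a vocabulary estimate.
benchmarks = [
    Benchmark("word segmentation @ 8 months", 0.65),
    Benchmark("productive vocabulary @ 24 months", 300.0),
]

# Made-up scores for one model variant fed realistic input.
model_scores = {
    "word segmentation @ 8 months": 0.55,
    "productive vocabulary @ 24 months": 210.0,
}

for b in benchmarks:
    ratio = efficiency(model_scores[b.name], b.child_score)
    print(f"{b.name}: efficiency = {ratio:.2f}")
```

Running the same benchmarks over model variants with each mechanism switched on or off is what would make the efficiency contribution of each mechanism separately measurable.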
Language Focus
We start with English and French, which both have large, accessible annotated speech corpora and well-documented acquisition milestones, and focus on the first three years of life. We then help build similar resources across a larger set of languages by fostering a cross-disciplinary community that shares tools, data, and analysis methods.
Financial details & Timeline
Financial details
Grant amount | € 2.494.625 |
Total project budget | € 2.494.625 |
Timeline
Start date | 1-1-2025 |
End date | 31-12-2029 |
Grant year | 2025 |
Partners & Locations
Project partners
- ECOLE DES HAUTES ETUDES EN SCIENCES SOCIALES (lead partner)
- ECOLE NORMALE SUPERIEURE
Country(ies)
Similar projects within the European Research Council
Project | Scheme | Amount | Year |
---|---|---|---|
Gates to Language: The GALA project investigates the biological mechanisms of language acquisition in humans and nonhuman species to uncover why only humans can learn language. | ERC Starting... | € 1.490.057 | 2024 |
Infant verbal Memory in Development: a window for understanding language constraints and brain plasticity from birth: IN-MIND investigates the development of verbal memory in infants to understand its role in language learning, using innovative methods to identify memory capacities and intervention windows. | ERC Starting... | € 1.499.798 | 2022 |
Multiple routes to memory for a second language: Individual and situational factors: This project investigates alternative routes to second language acquisition by applying memory research theories to understand individual and situational differences in learning processes. | ERC Consolid... | € 2.000.000 | 2022 |
DEep COgnition Learning for LAnguage GEneration: This project aims to enhance NLP models by integrating machine learning, cognitive science, and structured memory to improve out-of-domain generalization and contextual understanding in language generation tasks. | ERC Consolid... | € 1.999.595 | 2023 |
Controlling Large Language Models: Develop a framework to understand and control large language models, addressing biases and flaws to ensure safe and responsible AI adoption. | ERC Starting... | € 1.500.000 | 2024 |
Similar projects from other schemes
Project | Scheme | Amount | Year |
---|---|---|---|
e-LEARN-IT: The project develops an in-ear learning assistant for personalized, movement-based education and speech therapy using innovative speech technology. | Mkb-innovati... | € 200.000 | 2020 |
Feasibility study of AIPerLearn (AI-Powered Personalized Learning): STARK Learning investigates the application and training of AI models to automate the development of personalized teaching materials and to safeguard their quality and validation. | Mkb-innovati... | € 20.000 | 2023 |