Why do infants learn language so fast? A reverse engineering approach
This project develops a computational model to explore how infants learn language so efficiently through statistical learning plus three additional mechanisms, aiming to produce outcomes comparable to children's language acquisition.
Project details
Introduction
How do infants learn their first language(s)? The popular yet controversial 'statistical learning hypothesis' posits that they learn by gradually collecting statistics over their language input. This is strikingly similar to how current AI Large Language Models (LLMs) learn, suggesting that simple statistical mechanisms may be sufficient to attain adult-like language competence.
Learning Data Comparison
But does it? Estimates of language input to children show that by age 3, they have received two to three orders of magnitude less data than LLMs of similar performance. The gap grows exponentially larger with children's age. Worse, when models are fed speech instead of text, they learn even more slowly. How do infants learn so efficiently?
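The scale of this gap can be illustrated with a back-of-envelope calculation. The figures below are assumptions for illustration only (plausible published estimates put a 3-year-old's cumulative input in the tens of millions of words, while LLMs of comparable benchmark performance train on billions of tokens); they are not data from this project:

```python
import math

# Assumed, illustrative figures -- not project data:
child_words = 3e7   # ~30 million words heard by age 3 (assumed estimate)
llm_tokens = 3e10   # ~30 billion training tokens for a comparable LLM (assumed)

# The gap in orders of magnitude is the base-10 log of the ratio.
gap_orders = math.log10(llm_tokens / child_words)
print(f"data gap ≈ {gap_orders:.1f} orders of magnitude")  # prints "data gap ≈ 3.0 orders of magnitude"
```

Under these assumed figures the gap lands at the upper end of the two-to-three-orders range cited above; larger LLM training sets push it further still.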
Hypothesis Testing
This project tests the hypothesis that in addition to statistical learning, infants benefit from three mechanisms that accelerate their learning rate:
- They are born with a vocal tract which helps them understand the link between abstract motor commands and speech sounds, and decode noisy speech inputs more efficiently.
- They have an episodic memory enabling them to learn from unique events, instead of gradually learning from thousands of repetitions.
- They start with an evolved learning architecture optimized for generalization from few and noisy inputs.
Methodology
Our approach is to build a computational model of the learner (an infant simulator) which, when fed realistic language input, produces outcome measures comparable to children's (laboratory experiments, vocabulary estimates). This yields a quantitative estimate of the efficiency gain from each of the three mechanisms, as well as new testable predictions.
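The comparison logic described above can be sketched as follows. This is a hypothetical, minimal sketch, not project code: every benchmark name, score, and the `efficiency` helper are invented for illustration. The idea is simply that a model variant (e.g. with episodic memory enabled) is scored on the same outcome measures as children, and the ratio quantifies how child-like its learning is:

```python
from dataclasses import dataclass

@dataclass
class Benchmark:
    """One child-derived outcome measure (names and values are made up)."""
    name: str
    child_score: float

def efficiency(model_score: float, child_score: float) -> float:
    """Ratio of model outcome to child outcome (1.0 = child-like)."""
    return model_score / child_score

# Illustrative benchmarks: a lab-experiment accuracy and a vocabulary estimate.
benchmarks = [
    Benchmark("word segmentation @ 8 months", 0.65),
    Benchmark("productive vocabulary @ 24 months", 300.0),
]

# Made-up scores for one model variant fed realistic input.
model_scores = {
    "word segmentation @ 8 months": 0.55,
    "productive vocabulary @ 24 months": 210.0,
}

for b in benchmarks:
    ratio = efficiency(model_scores[b.name], b.child_score)
    print(f"{b.name}: efficiency = {ratio:.2f}")
```

Running the same benchmarks over model variants with each mechanism switched on or off is what would make the efficiency contribution of each mechanism separately measurable.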
Language Focus
We start with English and French, which both have large, accessible annotated speech corpora and well-documented acquisition milestones, and focus on the first three years of life. We then help build similar resources across a larger set of languages by fostering a cross-disciplinary community that shares tools, data, and analysis methods.
Financial details & Timeline
Financial details
Grant amount | € 2.494.625 |
Total project budget | € 2.494.625 |
Timeline
Start date | 1-1-2025 |
End date | 31-12-2029 |
Grant year | 2025 |
Partners & Locations
Project partners
- ECOLE DES HAUTES ETUDES EN SCIENCES SOCIALES (lead partner)
- ECOLE NORMALE SUPERIEURE
Country(ies)
Similar projects within the European Research Council
Project | Scheme | Amount | Year |
---|---|---|---|
Gates to Language: The GALA project investigates the biological mechanisms of language acquisition in humans and nonhuman species to uncover why only humans can learn language. | ERC Starting... | € 1.490.057 | 2024 |
Infant verbal Memory in Development: a window for understanding language constraints and brain plasticity from birth: IN-MIND investigates the development of verbal memory in infants to understand its role in language learning, using innovative methods to identify memory capacities and intervention windows. | ERC Starting... | € 1.499.798 | 2022 |
Multiple routes to memory for a second language: Individual and situational factors: This project investigates alternative routes to second language acquisition by applying memory research theories to understand individual and situational differences in learning processes. | ERC Consolid... | € 2.000.000 | 2022 |
DEep COgnition Learning for LAnguage GEneration: This project aims to enhance NLP models by integrating machine learning, cognitive science, and structured memory to improve out-of-domain generalization and contextual understanding in language generation tasks. | ERC Consolid... | € 1.999.595 | 2023 |
Controlling Large Language Models: Develop a framework to understand and control large language models, addressing biases and flaws to ensure safe and responsible AI adoption. | ERC Starting... | € 1.500.000 | 2024 |
Similar projects from other schemes
Project | Scheme | Amount | Year |
---|---|---|---|
e-LEARN-IT: The project develops an in-ear learning assistant for personalized, movement-based education and speech therapy using innovative speech technology. | Mkb-innovati... | € 200.000 | 2020 |
Feasibility study of AIPerLearn (AI-Powered Personalized Learning): STARK Learning investigates the application and training of AI models to automate the development of personalized teaching materials and to safeguard their quality and validation. | Mkb-innovati... | € 20.000 | 2023 |