Why do infants learn language so fast? A reverse engineering approach

This project develops a computational model to explore how infants efficiently learn language through statistical learning and three additional mechanisms, aiming to produce outcomes comparable to children's language acquisition.

Grant
€ 2.494.625
2025

Project details

Introduction

How do infants learn their first language(s)? The popular yet controversial 'statistical learning hypothesis' posits that they learn by gradually collecting statistics over their language inputs. This is strikingly similar to how today's Large Language Models (LLMs) learn, and it suggests that simple statistical mechanisms may be sufficient to attain adult-like language competence.

Learning Data Comparison

But does it? Estimates of language input to children show that by age 3, they have received 2 to 3 orders of magnitude less data than LLMs of similar performance. The gap grows exponentially with children's age. Worse, when models are fed speech instead of text, they learn even more slowly. How are infants such efficient learners?
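
To make the size of that gap concrete, here is a minimal back-of-the-envelope sketch; the token counts below are illustrative placeholders chosen only to show the calculation, not estimates from the project.

```python
import math

# Illustrative placeholder figures (NOT from the project): word tokens a child
# might hear by age 3 versus tokens an LLM of comparable performance trains on.
child_tokens_by_age_3 = 10_000_000      # hypothetical: ~10 million words
llm_training_tokens = 10_000_000_000    # hypothetical: ~10 billion tokens

gap = llm_training_tokens / child_tokens_by_age_3
print(f"Data gap: {gap:,.0f}x (~{math.log10(gap):.0f} orders of magnitude)")
# -> Data gap: 1,000x (~3 orders of magnitude)
```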

Hypothesis Testing

This project tests the hypothesis that in addition to statistical learning, infants benefit from three mechanisms that accelerate their learning rate:

  1. They are born with a vocal tract, which helps them understand the link between abstract motor commands and speech sounds and decode noisy speech input more efficiently.
  2. They have an episodic memory enabling them to learn from unique events, instead of gradually learning from thousands of repetitions (see the toy contrast sketched after this list).
  3. They start with an evolved learning architecture optimized for generalization from few and noisy inputs.
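
As a toy contrast for mechanism 2, the sketch below caricatures the difference between gradual statistical learning and one-shot episodic learning; the classes, threshold, and example words are purely illustrative and are not components of the project's model.

```python
from collections import Counter

class StatisticalLearner:
    """Gradual learner: a word only counts as 'known' after many exposures."""
    def __init__(self, threshold: int = 50):
        self.counts = Counter()
        self.threshold = threshold

    def hear(self, word: str) -> None:
        self.counts[word] += 1

    def knows(self, word: str) -> bool:
        return self.counts[word] >= self.threshold

class EpisodicLearner:
    """One-shot learner: a single salient episode is enough to store a word."""
    def __init__(self):
        self.episodes: dict[str, str] = {}

    def hear(self, word: str, context: str) -> None:
        self.episodes.setdefault(word, context)

    def knows(self, word: str) -> bool:
        return word in self.episodes

stat, epi = StatisticalLearner(), EpisodicLearner()
stat.hear("giraffe")                                   # one exposure
epi.hear("giraffe", context="at the zoo, pointing")    # one episode
print(stat.knows("giraffe"), epi.knows("giraffe"))     # False True
```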

Methodology

Our approach is to build a computational model of the learner (an infant simulator) which, when fed realistic language input, produces outcome measures comparable to children's (laboratory experiments, vocabulary estimates). This gives a quantitative estimate of the efficiency of each of the three mechanisms, as well as new testable predictions.
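
A minimal sketch of what such a model-to-child comparison could look like; the class, functions, and benchmark values below are hypothetical illustrations, not the project's actual simulator or data.

```python
from dataclasses import dataclass

@dataclass
class OutcomeMeasures:
    """Outcome measures mirroring child benchmarks (hypothetical)."""
    vocabulary_size: int          # e.g. estimated receptive vocabulary
    discrimination_score: float   # e.g. phoneme-discrimination accuracy, 0-1

class InfantSimulator:
    """Toy stand-in for the computational learner, not the project's model."""
    def __init__(self) -> None:
        self.lexicon: set[str] = set()

    def listen(self, utterance: str) -> None:
        # Placeholder 'statistical learning': collect word forms from input.
        self.lexicon.update(utterance.lower().split())

    def measure(self) -> OutcomeMeasures:
        return OutcomeMeasures(vocabulary_size=len(self.lexicon),
                               discrimination_score=0.5)

def compare(model: OutcomeMeasures, child: OutcomeMeasures) -> dict[str, float]:
    """Gap between simulated and observed outcomes on each measure."""
    return {
        "vocabulary_gap": model.vocabulary_size - child.vocabulary_size,
        "discrimination_gap": model.discrimination_score - child.discrimination_score,
    }

sim = InfantSimulator()
for utterance in ["look at the doggy", "where is the ball", "the doggy runs"]:
    sim.listen(utterance)

# Hypothetical child benchmark values, for illustration only.
child_benchmark = OutcomeMeasures(vocabulary_size=8, discrimination_score=0.8)
print(compare(sim.measure(), child_benchmark))
```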

Language Focus

We start with English and French, which both have large, accessible annotated speech corpora and documented acquisition milestones, and we focus on the first three years of life. We then help build similar resources for a larger set of languages by fostering a cross-disciplinary community that shares tools, data, and analysis methods.

Financial details & Timeline

Financial details

Grant amount: € 2.494.625
Total project budget: € 2.494.625

Timeline

Start date: 1-1-2025
End date: 31-12-2029
Grant year: 2025

Partners & Locations

Project partners

  • ECOLE DES HAUTES ETUDES EN SCIENCES SOCIALES (coordinator)
  • ECOLE NORMALE SUPERIEURE

Country/Countries

France

Similar projects within the European Research Council

ERC Starting Grant

Gates to Language

The GALA project investigates the biological mechanisms of language acquisition in humans and nonhuman species to uncover why only humans can learn language.

€ 1.490.057
ERC Starting Grant

Infant verbal Memory in Development: a window for understanding language constraints and brain plasticity from birth

IN-MIND investigates the development of verbal memory in infants to understand its role in language learning, using innovative methods to identify memory capacities and intervention windows.

€ 1.499.798
ERC Consolidator Grant

Multiple routes to memory for a second language: Individual and situational factors

This project investigates alternative routes to second language acquisition by applying memory research theories to understand individual and situational differences in learning processes.

€ 2.000.000
ERC Consolidator Grant

DEep COgnition Learning for LAnguage GEneration

This project aims to enhance NLP models by integrating machine learning, cognitive science, and structured memory to improve out-of-domain generalization and contextual understanding in language generation tasks.

€ 1.999.595
ERC Starting Grant

Controlling Large Language Models

This project develops a framework to understand and control large language models, addressing biases and flaws to ensure safe and responsible AI adoption.

€ 1.500.000

Similar projects from other funding schemes

Mkb-innovatiestimulering Regio en Topsectoren (MIT)

e-LEARN-IT

The project develops an in-ear learning assistant for personalized, movement-based education and speech therapy using innovative speech technology.

€ 200.000
Mkb-innovatiestimulering Regio en Topsectoren (MIT)

Feasibility study into AIPerLearn (AI-Powered Personalized Learning)

STARK Learning is investigating the application and training of AI models to automate the development of personalized teaching materials and to safeguard their quality and validation.

€ 20.000