Federated foundational models for embodied perception

The FRONTIER project aims to develop foundational models for embodied perception by integrating neural networks with physical simulations, enhancing learning efficiency and collaboration across intelligent systems.

Grant
€ 2.499.825
2024

Project details

Introduction

Computer vision is undergoing a paradigm shift driven by large-scale foundational models that achieve impressive results on a wide range of recognition tasks. However, these models learn only static 2D image representations based on observed correlations between still images and natural language, whereas our world is three-dimensional and full of dynamic events and causal interactions.

Scientific Challenge

We argue that the next scientific challenge is to invent foundational models for embodied perception, that is, perception for systems that have a physical body, operate in a dynamic 3D world, and interact with the surrounding environment.

FRONTIER Proposal

The FRONTIER proposal addresses this challenge by means of:

  1. Developing New Architectures
    Developing a new class of foundational model architectures grounded in the geometrical and physical structure of the world that seamlessly combine large-scale neural networks with learnable differentiable physical simulation components to achieve generalization across tasks, situations, and environments.

  2. Designing New Learning Algorithms
    Designing new learning algorithms that incorporate the physical and geometric structure as constraints on the learning process to achieve new levels of data efficiency with the aim of bringing intelligent systems closer to humans who can often learn from only a few available examples.

  3. Developing Federated Learning Methods
    Developing new federated learning methods that will allow sharing and accumulating learning experiences across different embodied systems, thereby achieving new levels of scale, accuracy, and robustness not achievable by learning in any individual system alone.
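
As a purely illustrative sketch of how the three ingredients above could fit together, and not the FRONTIER method itself, the toy Python example below lets several simulated systems each fit one physical parameter (gravity `g`) of a tiny closed-form drop simulator by gradient descent on their own few observations, and then pools the results with federated averaging (FedAvg), a well-known baseline swapped in here for illustration. All names (`simulate_drop`, `local_update`, `federated_average`, `make_client`, `train`), the falling-object setup, and every constant are assumptions that do not come from the project description.

```python
import random

Y0 = 10.0  # assumed drop height in metres (hypothetical constant)


def simulate_drop(g, times):
    """Toy physics component: height of a dropped object at each time."""
    return [Y0 - 0.5 * g * t * t for t in times]


def local_update(g, times, heights, lr=0.5, steps=20):
    """Gradient descent on one client's private data (mean squared error)."""
    for _ in range(steps):
        preds = simulate_drop(g, times)
        # d/dg of mean((pred - y)^2), using d(pred)/dg = -0.5 * t^2
        grad = sum(2.0 * (p - y) * (-0.5 * t * t)
                   for p, y, t in zip(preds, heights, times)) / len(times)
        g -= lr * grad
    return g


def federated_average(params, sizes):
    """FedAvg step: average client parameters weighted by local sample count."""
    return sum(p * s for p, s in zip(params, sizes)) / sum(sizes)


def make_client(rng, n, true_g=9.81, noise=0.05):
    """Hypothetical client holding n noisy height observations."""
    times = [0.3 + 0.2 * i for i in range(n)]
    heights = [h + rng.gauss(0.0, noise) for h in simulate_drop(true_g, times)]
    return times, heights


def train(rounds=10, noise=0.05, seed=0):
    """Run a few federated rounds over four clients with 3 to 6 samples each."""
    rng = random.Random(seed)
    clients = [make_client(rng, n, noise=noise) for n in (3, 4, 5, 6)]
    g = 5.0  # deliberately poor initial guess for gravity
    for _ in range(rounds):
        # Each client refines the shared parameter on its own data only.
        local = [local_update(g, t, h) for t, h in clients]
        # The server sees parameter values and sample counts, not observations.
        g = federated_average(local, [len(t) for t, _ in clients])
    return g


if __name__ == "__main__":
    print(round(train(), 2))
```

Calling `train()` returns the shared estimate of `g` after ten rounds. In this sketch only parameter values and sample counts leave each client, while the physics model keeps the number of learnable quantities small enough that a handful of observations per client suffices.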

Implications

Breakthrough progress on these problems would have profound implications for our everyday lives as well as for science and commerce. This includes:

  • Safer cars that learn from each other
  • Intelligent production lines that collaboratively adapt to new workflows
  • A new generation of smart assistive robots that automatically learn new skills from the Internet and each other

Progress in this project would make these advances possible.

Financial details & Timeline

Financial details

Grant amount: € 2.499.825
Total project budget: € 2.499.825

Timeline

Start date: 1 January 2024
End date: 31 December 2028
Grant year: 2024

Partners & Locations

Project partners

  • CESKE VYSOKE UCENI TECHNICKE V PRAZE (coordinator)

Country/countries

Czechia

Similar projects within the European Research Council

ERC Starting...

Structured Interactive Perception and Learning for Holistic Robotic Embodied Intelligence

SIREN proposes a holistic framework for robot learning that integrates action-perception cycles and modular graph representations to enhance adaptability and robustness in dynamic environments.

€ 1.499.738
ERC Consolid...

Learning to synthesize interactive 3D models

This project aims to automate the generation of interactive 3D models using deep learning to enhance virtual environments and applications in animation, robotics, and digital entertainment.

€ 2.000.000
ERC Starting...

Spatial 3D Semantic Understanding for Perception in the Wild

The project aims to develop new algorithms for robust 3D visual perception and semantic understanding from 2D images, enhancing machine perception and immersive technologies.

€ 1.500.000
ERC Consolid...

A theory and model of the neural transformations mediating human object perception

TRANSFORM aims to develop a predictive model and theory of neural transformations for object perception by integrating brain imaging, mathematical analysis, and computational modeling.

€ 2.291.855
ERC Starting...

Omni-Supervised Learning for Dynamic Scene Understanding

This project aims to enhance dynamic scene understanding in autonomous vehicles by developing innovative machine learning models and methods for open-world object recognition from unlabeled video data.

€ 1.500.000