Omni-Supervised Learning for Dynamic Scene Understanding

This project aims to enhance dynamic scene understanding in autonomous vehicles by developing innovative machine learning models and methods for open-world object recognition from unlabeled video data.

Grant
€ 1.500.000
2023

Project details

Introduction

Computer vision has become a powerful technology, bringing applications such as autonomous vehicles and social robots closer to reality. To navigate a scene safely, autonomous vehicles need to understand the dynamic objects around them.

Dynamic Scene Understanding

In other words, we need computer vision algorithms to perform dynamic scene understanding (DSU), i.e., detection, segmentation, and tracking of multiple moving objects in a scene. This is an essential feature for higher-level tasks such as action recognition or decision making for autonomous vehicles.
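To make the three DSU sub-tasks concrete, the sketch below shows a minimal tracking-by-detection loop, one standard formulation in which per-frame detections are linked into tracks by bounding-box overlap. This is an illustrative toy, not the project's actual models; the boxes, threshold, and greedy matching are all assumptions for the example.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def track(frames, iou_thr=0.3):
    """frames: list of per-frame detection lists; returns a track id per detection."""
    tracks = {}   # track id -> last seen box
    next_id = 0
    out = []
    for dets in frames:
        used, ids = set(), []
        for box in dets:
            # greedily match to the unmatched live track with the highest overlap
            cands = [(tid, iou(b, box)) for tid, b in tracks.items() if tid not in used]
            tid, score = max(cands, key=lambda c: c[1], default=(None, 0.0))
            if score < iou_thr:
                # no sufficiently overlapping track: start a new one
                tid, next_id = next_id, next_id + 1
            used.add(tid)
            tracks[tid] = box
            ids.append(tid)
        out.append(ids)
    return out
```

A detection that overlaps an existing track keeps that track's id; a detection with no good match (e.g. an object entering the scene) spawns a new track.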

Challenges in Current Models

Much of the success of computer vision models for DSU has been driven by the rise of deep learning, in particular convolutional neural networks trained on large-scale datasets in a supervised way. However, the closed world defined by our datasets is not an accurate representation of the real world.

Limitations of Annotated Object Classes

If our methods only work on annotated object classes, what happens if a new object appears in front of an autonomous vehicle?

Proposed Solutions

We propose to rethink the deep learning models we use, the way we obtain data annotations, and the generalization of our models to previously unseen object classes. To bring the full power of computer vision algorithms for DSU to the open world, we will focus on three lines of research:

  1. Models: We will design novel machine learning models to address the shortcomings of convolutional neural networks. A hierarchical (from pixels to objects) image-dependent representation will allow us to capture spatio-temporal dependencies at all levels of the hierarchy.

  2. Data: To train our models, we will create a new large-scale DSU synthetic dataset and propose novel methods to mitigate the annotation costs for video data.

  3. Open-World: To bring DSU to the open-world, we will design methods that learn directly from unlabeled video streams. Our models will be able to detect, segment, retrieve, and track dynamic objects coming from classes never previously observed during the training of our models.
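The open-world goal above includes retrieving objects from classes never seen in training. One common class-agnostic way to frame retrieval is nearest-neighbour search in an embedding space, where similar objects cluster together without any class labels. The sketch below illustrates that idea only; the 2-D vectors stand in for embeddings that would, in practice, come from a learned network, and nothing here is the project's actual method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def retrieve(query, gallery, k=2):
    """Return indices of the k gallery embeddings most similar to the query."""
    ranked = sorted(range(len(gallery)),
                    key=lambda i: cosine(query, gallery[i]),
                    reverse=True)
    return ranked[:k]
```

Because retrieval ranks by similarity rather than predicting a label from a fixed set, it degrades gracefully on unseen classes: a novel object simply retrieves whatever looks most like it.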

Financial details & Timeline

Financial details

Grant amount: € 1.500.000
Total project budget: € 1.500.000

Timeline

Start date: 1-1-2023
End date: 31-12-2027
Grant year: 2023

Partners & Locations

Project partners

  • NVIDIA ITALY S.R.L. (lead partner)

Country(ies)

Italy

Similar projects within the European Research Council

ERC Starting Grant

Neural OmniVideo: Fusing World Knowledge into Smart Video-Specific Models

Develop Neural OmniVideo Models to enhance video analysis and synthesis by integrating deep learning frameworks with external knowledge for improved representation and understanding of dynamic content.

€ 1.500.000
ERC Advanced Grant

Exploration of Unknown Environments for Digital Twins

The 'explorer' project aims to automate video data capture and labeling in open worlds to facilitate the creation of semantically rich Digital Twins for complex environments using AI-driven methods.

€ 2.476.718
ERC Advanced Grant

Federated foundational models for embodied perception

The FRONTIER project aims to develop foundational models for embodied perception by integrating neural networks with physical simulations, enhancing learning efficiency and collaboration across intelligent systems.

€ 2.499.825
ERC Starting Grant

Spatial 3D Semantic Understanding for Perception in the Wild

The project aims to develop new algorithms for robust 3D visual perception and semantic understanding from 2D images, enhancing machine perception and immersive technologies.

€ 1.500.000
ERC Consolidator Grant

Learning to synthesize interactive 3D models

This project aims to automate the generation of interactive 3D models using deep learning to enhance virtual environments and applications in animation, robotics, and digital entertainment.

€ 2.000.000

Similar projects from other schemes

Mkb-innovati...

Deep Learning for Advanced Robot Motion Planning

The project investigates the feasibility of a training method for robots in complex, unpredictable environments.

€ 20.000
Mkb-innovati...

CrimeSense

The project investigates the feasibility of an automatic violence-detection system for surveillance cameras, based on virtual gaming data.

€ 20.000