Omni-Supervised Learning for Dynamic Scene Understanding
This project aims to enhance dynamic scene understanding in autonomous vehicles by developing innovative machine learning models and methods for open-world object recognition from unlabeled video data.
Project details
Introduction
Computer vision has become a powerful technology, able to bring applications such as autonomous vehicles and social robots closer to reality. In order for autonomous vehicles to safely navigate a scene, they need to understand the dynamic objects around them.
Dynamic Scene Understanding
In other words, we need computer vision algorithms to perform dynamic scene understanding (DSU), i.e., detection, segmentation, and tracking of multiple moving objects in a scene. This is an essential feature for higher-level tasks such as action recognition or decision making for autonomous vehicles.
Challenges in Current Models
Much of the success of computer vision models for DSU has been driven by the rise of deep learning, in particular, convolutional neural networks trained on large-scale datasets in a supervised way. However, the closed-world created by our datasets is not an accurate representation of the real world.
Limitations of Annotated Object Classes
If our methods only work on annotated object classes, what happens if a new object appears in front of an autonomous vehicle?
Proposed Solutions
We propose to rethink the deep learning models we use, the way we obtain data annotations, as well as the generalization of our models to previously unseen object classes. To bring all the power of computer vision algorithms for DSU to the open-world, we will focus on three lines of research:
- Models: We will design novel machine learning models to address the shortcomings of convolutional neural networks. A hierarchical (from pixels to objects) image-dependent representation will allow us to capture spatio-temporal dependencies at all levels of the hierarchy (see the first sketch after this list).
- Data: To train our models, we will create a new large-scale DSU synthetic dataset and propose novel methods to mitigate the annotation costs for video data.
- Open-World: To bring DSU to the open-world, we will design methods that learn directly from unlabeled video streams. Our models will be able to detect, segment, retrieve, and track dynamic objects coming from classes never previously observed during the training of our models (see the second sketch after this list).
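To make the first research line more concrete, the following is a minimal, hypothetical sketch of a hierarchical, image-dependent representation in PyTorch: per-pixel features are softly grouped into regions, and regions into object slots, so features can be read out at every level of the pixels-to-objects hierarchy. All module names, layer sizes, and the soft-assignment scheme are illustrative assumptions, not the project's actual architecture.

```python
# Illustrative sketch only -- not the project's model. A tiny pixels -> regions ->
# objects hierarchy in which the grouping itself is predicted from the image.
import torch
import torch.nn as nn


class PixelsToObjects(nn.Module):
    def __init__(self, dim=32, num_regions=16, num_objects=4):
        super().__init__()
        self.pixel_encoder = nn.Conv2d(3, dim, 3, padding=1)   # pixel-level features
        self.region_assign = nn.Conv2d(dim, num_regions, 1)    # pixel -> region logits
        self.object_assign = nn.Linear(dim, num_objects)       # region -> object logits

    def forward(self, frames):                        # frames: (B, 3, H, W)
        pix = self.pixel_encoder(frames)              # (B, D, H, W)
        pix_flat = pix.flatten(2).transpose(1, 2)     # (B, H*W, D)

        # Soft pixel -> region assignment; region features are weighted means of pixels.
        a_pr = self.region_assign(pix).flatten(2).softmax(dim=1)           # (B, R, H*W)
        regions = a_pr @ pix_flat / (a_pr.sum(-1, keepdim=True) + 1e-6)    # (B, R, D)

        # Soft region -> object assignment, one level up the hierarchy.
        a_ro = self.object_assign(regions).softmax(dim=2).transpose(1, 2)  # (B, O, R)
        objects = a_ro @ regions / (a_ro.sum(-1, keepdim=True) + 1e-6)     # (B, O, D)
        return pix, regions, objects                  # pixel-, region-, object-level features


if __name__ == "__main__":
    model = PixelsToObjects()
    pix, regions, objects = model(torch.rand(2, 3, 64, 64))
    print(pix.shape, regions.shape, objects.shape)    # three levels of the hierarchy
```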
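The second sketch, equally hypothetical, illustrates how annotation costs could be reduced in the spirit of omni-supervised learning (the second and third research lines): a class-agnostic objectness model scores unlabeled video frames, and only its most confident predictions are kept as pseudo-labels to grow the training pool. The model, thresholds, and mask heuristic are assumptions for illustration, not the project's method.

```python
# Illustrative sketch only -- not the project's method. Mining class-agnostic
# pseudo-labels from unlabeled frames so they can be mixed into supervised training.
import torch
import torch.nn as nn


class ObjectnessHead(nn.Module):
    """Toy per-pixel objectness scorer standing in for a real detector/segmenter."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),            # 1-channel objectness logit map
        )

    def forward(self, frames):              # frames: (B, 3, H, W)
        return self.net(frames)             # logits: (B, 1, H, W)


def mine_pseudo_labels(model, unlabeled_frames, keep_threshold=0.9):
    """Keep frames whose peak objectness is confident; binarize maps as pseudo-masks."""
    model.eval()
    with torch.no_grad():
        probs = torch.sigmoid(model(unlabeled_frames))   # (B, 1, H, W)
        confidence, _ = probs.flatten(1).max(dim=1)      # per-frame peak score
        keep = confidence > keep_threshold
        pseudo_masks = (probs[keep] > 0.5).float()       # class-agnostic pseudo-masks
    return unlabeled_frames[keep], pseudo_masks


if __name__ == "__main__":
    model = ObjectnessHead()
    unlabeled = torch.rand(8, 3, 64, 64)                 # stand-in for video frames
    frames, masks = mine_pseudo_labels(model, unlabeled, keep_threshold=0.5)
    print(f"kept {frames.shape[0]} of {unlabeled.shape[0]} frames as pseudo-labeled data")
```

In a real pipeline, such pseudo-labeled frames would be merged with the annotated data and the model retrained, repeating the cycle as more unlabeled video arrives.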
Financial details & Timeline
Financial details
Grant amount | € 1.500.000
Total project budget | € 1.500.000
Timeline
Start date | 1-1-2023
End date | 31-12-2027
Grant year | 2023
Partners & Locations
Project partners
- NVIDIA ITALY S.R.L. (lead partner)
Country/Countries
Similar projects within the European Research Council
Project | Scheme | Amount | Year
---|---|---|---
Neural OmniVideo: Fusing World Knowledge into Smart Video-Specific Models | ERC Starting... | € 1.500.000 | 2024
Exploration of Unknown Environments for Digital Twins | ERC Advanced... | € 2.476.718 | 2023
Federated foundational models for embodied perception | ERC Advanced... | € 2.499.825 | 2024
Spatial 3D Semantic Understanding for Perception in the Wild | ERC Starting... | € 1.500.000 | 2023
Learning to synthesize interactive 3D models | ERC Consolid... | € 2.000.000 | 2024
Neural OmniVideo: Fusing World Knowledge into Smart Video-Specific Models
Develop Neural OmniVideo Models to enhance video analysis and synthesis by integrating deep learning frameworks with external knowledge for improved representation and understanding of dynamic content.
Exploration of Unknown Environments for Digital Twins
The 'explorer' project aims to automate video data capture and labeling in open worlds to facilitate the creation of semantically rich Digital Twins for complex environments using AI-driven methods.
Federated foundational models for embodied perception
The FRONTIER project aims to develop foundational models for embodied perception by integrating neural networks with physical simulations, enhancing learning efficiency and collaboration across intelligent systems.
Spatial 3D Semantic Understanding for Perception in the Wild
The project aims to develop new algorithms for robust 3D visual perception and semantic understanding from 2D images, enhancing machine perception and immersive technologies.
Learning to synthesize interactive 3D models
This project aims to automate the generation of interactive 3D models using deep learning to enhance virtual environments and applications in animation, robotics, and digital entertainment.
Similar projects from other schemes
Project | Scheme | Amount | Year
---|---|---|---
Deep Learning for Advanced Robot Motion Planning | Mkb-innovati... | € 20.000 | 2021
CrimeSense | Mkb-innovati... | € 20.000 | 2022
Deep Learning for Advanced Robot Motion Planning
The project investigates the feasibility of a training method for robots in complex, unpredictable environments.
CrimeSense
The project investigates the feasibility of an automatic violence detection system for surveillance cameras based on virtual gaming data.