Hybrid and Interpretable Deep neural audio machines

HI-Audio aims to develop hybrid deep learning models that integrate interpretable signal processing with neural architectures for enhanced audio analysis and synthesis applications.

Grant
€ 2.482.317
2022

Project details

Introduction

Machine Listening, or AI for Sound, is the general field of Artificial Intelligence applied to audio analysis, understanding, and synthesis by a machine. Access to ever-growing supercomputing facilities, combined with the availability of huge (though largely unannotated) data repositories, has driven a strong trend toward purely data-driven machine learning approaches.

Trends in Machine Listening

The field has rapidly moved towards end-to-end neural approaches, which aim to solve the machine learning problem directly from raw acoustic signals. However, these approaches often take the nature and structure of the processed data into account only loosely.

Consequences of Current Approaches

The main consequences are that the models are:

  1. Overly complex, requiring massive amounts of training data and extreme computing power to be effective (in terms of task performance).
  2. Largely unexplainable and non-interpretable.

Proposed Solutions

To overcome these major shortcomings, we believe that our prior knowledge about the nature of the processed data, their generation process, and their perception by humans should be explicitly exploited in neural-based machine learning frameworks.

Project Aim

The aim of HI-Audio is to build such hybrid deep approaches combining:

  • Parameter-efficient and interpretable signal models
  • Musicological and physics-based models
  • Highly tailored, deep neural architectures
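As a purely illustrative sketch of what such a combination can look like (not the project's actual method), the snippet below pairs an interpretable harmonic signal model, described by one fundamental frequency and a handful of harmonic amplitudes, with a stand-in "encoder" that predicts those amplitudes. The names `harmonic_synth` and `toy_encoder`, the random weights, and all parameter values are assumptions made for this example; in a real hybrid system the encoder would be a trained neural network.

```python
import numpy as np

def harmonic_synth(f0_hz, harmonic_amps, sample_rate=16000, duration_s=0.5):
    """Interpretable signal model: a sum of sinusoids at integer multiples of f0."""
    t = np.arange(int(sample_rate * duration_s)) / sample_rate
    k = np.arange(1, len(harmonic_amps) + 1)       # harmonic numbers 1..K
    freqs = k * f0_hz
    # Silence any harmonic at or above the Nyquist frequency to avoid aliasing.
    amps = np.where(freqs < sample_rate / 2, harmonic_amps, 0.0)
    return (amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t[None, :])).sum(axis=0)

def toy_encoder(features, weights):
    """Hypothetical stand-in for a neural encoder (assumption, untrained):
    maps input features to positive harmonic amplitudes that sum to 1."""
    logits = features @ weights
    exp = np.exp(logits - logits.max())            # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=8)                      # placeholder input features
weights = rng.normal(size=(8, 4))                  # placeholder "network" weights
amps = toy_encoder(features, weights)              # 4 interpretable parameters
audio = harmonic_synth(220.0, amps)                # 0.5 s of audio at 16 kHz
```

The point of the design is that the learned part only outputs a few physically meaningful parameters (here, four harmonic amplitudes), while the signal model turns them into audio, so each parameter can be inspected or edited directly.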

Research Directions

The research directions pursued in HI-Audio will exploit novel deterministic and statistical audio and sound environment models with dedicated neural auto-encoders and generative networks. The project will target specific applications including:

  • Speech and audio scene analysis
  • Music information retrieval
  • Sound transformation and synthesis

Financial details & Timeline

Financial details

Grant amount: € 2.482.317
Total project budget: € 2.482.317

Timeline

Start date: 1 October 2022
End date: 30 September 2027
Grant year: 2022

Partners & Locations

Project partners

  • INSTITUT MINES-TELECOM (coordinator)

Country/countries

France

Similar projects within the European Research Council

ERC Synergy ...

Natural Auditory SCEnes in Humans and Machines: Establishing the Neural Computations of Everyday Hearing

The NASCE project aims to understand auditory scene analysis by developing the Semantic Segmentation Hypothesis, integrating neuroscience and AI to enhance comprehension and applications in machine hearing.

€ 8.622.811
ERC Consolid...

Reconciling Classical and Modern (Deep) Machine Learning for Real-World Applications

APHELEIA aims to create robust, interpretable, and efficient machine learning models that require less data by integrating classical methods with modern deep learning, fostering interdisciplinary collaboration.

€ 1.999.375
ERC Starting...

Interactive and Explainable Human-Centered AutoML

ixAutoML aims to enhance trust and interactivity in automated machine learning by integrating human insights and explanations, fostering democratization and efficiency in ML applications.

€ 1.459.763
ERC Advanced...

Deep Culture - Living with Difference in the Age of Deep Learning

DEEP CULTURE aims to critically explore the intersection of deep learning and cultural production through an interdisciplinary framework, fostering new methodologies and public engagement.

€ 2.500.000
ERC Starting...

Inference in High Dimensions: Light-speed Algorithms and Information Limits

The INF^2 project develops information-theoretically grounded methods for efficient high-dimensional inference in machine learning, aiming to reduce costs and enhance interpretability in applications like genome-wide studies.

€ 1.662.400

Similar projects from other schemes

Mkb-innovati...

Semi-supervised learning for a dynamic noise map in the built environment

MuniSense, Peutz, and Embedded Acoustics are developing a dynamic, AI-based noise map for sound monitoring in the built environment, with an emphasis on reliable, people-centred participation.

€ 285.180
Mkb-innovati...

A standard for production-grade Deep Learning systems

The project aims to improve audio and video analysis systems through collaboration between Media Distillery, NovoLanguage, and a partner, targeting higher quality and faster development via shared technologies.

€ 104.061
Mkb-innovati...

Project Hominis

The project aims to develop an ethical AI system for natural language processing that minimizes bias and manages technical, economic, and regulatory risks.

€ 20.000
Mkb-innovati...

RESONIKS

The project uses Machine Learning and AI for fast, operator-free acoustic quality inspections, which improves lead time and quality and increases companies' reputation and revenue.

€ 188.300
EIC Accelerator

Neuron Soundware: detecting machine failures early combining sound, AI and IoT technologies

NSW is an AI-driven diagnostic technology that detects machine faults early through acoustic analysis, enhancing industrial sustainability with 99.6% accuracy and minimizing downtime.

€ 2.500.000