Scalable Learning for Reproducibility in High-Dimensional Biomedical Signal Processing: A Robust Data Science Framework

ScReeningData aims to develop a scalable learning framework to enhance statistical robustness and reproducibility in high-dimensional data analysis, reducing false positives across scientific domains.

Grant

€ 1,500,000

2022

Project details

Introduction

Data science has quickly expanded the boundaries of signal processing and statistical learning beyond their accustomed domains. Powerful and complex machine learning architectures have evolved to distinguish relevant information from randomness, artifacts, and irrelevant data.

Challenges in Existing Frameworks

However, existing learning frameworks lack computationally scalable, tractable, and robust methods for high-dimensional data. Consequently, apparent discoveries in genomic data, for example, can be coincidental findings that merely happen to reach statistical significance.

Need for Appropriate Learning Frameworks

As long as groundbreaking advances in biotechnology are not accompanied by appropriate learning frameworks, valuable effort is wasted on researching false positives.

Project Overview

ScReeningData develops a coherent, fast, and scalable learning framework that jointly addresses the fundamental challenges of:

Drastically reducing computational complexity
Providing statistical and robustness guarantees
Quantifying reproducibility in large-scale and high-dimensional settings

Innovative Approach

An unprecedented approach is developed that builds upon very recent work of the PI. The underlying concept is to repeat randomized controlled experiments that use computer-generated fake variables as negative controls to trigger an early stopping of the learning algorithms, thereby mitigating the so-called curse of dimensionality.
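The negative-control idea can be illustrated with a small sketch. This is not the PI's algorithm, only a toy instance of the concept under stated assumptions: the learner is a plain greedy forward selection (matching pursuit), the fake variables are standard Gaussian noise columns, and the names `run_with_dummies`, `screen`, `make_data`, as well as the parameters `max_dummies` and `vote`, are hypothetical choices made for this example.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def run_with_dummies(columns, y, n_dummies, max_dummies, rng):
    """One randomized experiment: greedy forward selection that stops
    early once `max_dummies` fake variables have entered the model."""
    n, p = len(y), len(columns)
    # Negative controls: pure-noise columns that are unrelated to y.
    dummies = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_dummies)]
    candidates = columns + dummies
    residual = list(y)
    active, dummies_in = set(), 0
    while dummies_in < max_dummies and len(active) < len(candidates):
        # Pick the not-yet-selected column most correlated with the residual.
        best = max(
            (j for j in range(len(candidates)) if j not in active),
            key=lambda j: abs(dot(candidates[j], residual))
            / dot(candidates[j], candidates[j]) ** 0.5,
        )
        active.add(best)
        if best >= p:
            dummies_in += 1  # a fake variable was selected
        # Remove the selected column's contribution from the residual.
        col = candidates[best]
        step = dot(col, residual) / dot(col, col)
        residual = [r - step * c for r, c in zip(residual, col)]
    # Report only the real variables that entered before stopping.
    return {j for j in active if j < p}


def screen(columns, y, n_experiments=20, max_dummies=1, vote=0.75, seed=0):
    """Repeat the experiment with fresh dummies and keep the real variables
    whose selection frequency exceeds the voting threshold `vote`."""
    rng = random.Random(seed)
    p = len(columns)
    counts = [0] * p
    for _ in range(n_experiments):
        for j in run_with_dummies(columns, y, p, max_dummies, rng):
            counts[j] += 1
    freq = [c / n_experiments for c in counts]
    selected = [j for j in range(p) if freq[j] > vote]
    return selected, freq


def make_data(n=100, p=30, support=(0, 1), coef=5.0, seed=1):
    """Synthetic data: only the columns in `support` influence y."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    y = [
        sum(coef * columns[j][i] for j in support) + rng.gauss(0.0, 1.0)
        for i in range(n)
    ]
    return columns, y


if __name__ == "__main__":
    columns, y = make_data()
    selected, freq = screen(columns, y)
    print("selected variables:", selected)
```

In this toy setting, each run ends as soon as a fake variable is picked, so only a handful of steps are computed regardless of how many candidate variables exist. Repeating the experiment with fresh fake variables and voting across runs is what filters out real variables that entered by chance.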

Advantages of Proposed Methods

In contrast to existing methods, the proposed methods are completely tractable and scalable to ultra-high dimensions. The potential gains are substantial: advanced robust learning methods that run ultra-fast and come with tight guarantees on the targeted rate of false positives.

Impact on Scientific Domains

The proposed methods lead to new reproducible discoveries that can be made with high statistical power. Because the methods are fundamental in nature and broadly applicable, the impact of this project extends far beyond the biomedical signal processing use cases considered here, benefiting all scientific domains that analyze high-dimensional data.

Financial details & Timeline

Financial details

Grant amount	€ 1,500,000
Total project budget	€ 1,500,000

Timeline

Start date	1 September 2022
End date	31 August 2027
Grant year	2022

Partners & Locations

Project partners

TECHNISCHE UNIVERSITAT DARMSTADT (coordinator)

Country(ies)

Germany

Similar projects within the European Research Council

Project	Summary	Scheme	Amount	Year
Reconciling Classical and Modern (Deep) Machine Learning for Real-World Applications	APHELEIA aims to create robust, interpretable, and efficient machine learning models that require less data by integrating classical methods with modern deep learning, fostering interdisciplinary collaboration.	ERC Consolidator Grant	€ 1,999,375	2023
Provable Scalability for high-dimensional Bayesian Learning	This project develops a mathematical theory for scalable Bayesian learning methods, integrating computational and statistical insights to enhance algorithm efficiency and applicability in high-dimensional models.	ERC Starting Grant	€ 1,488,673	2023
Inference in High Dimensions: Light-speed Algorithms and Information Limits	The INF^2 project develops information-theoretically grounded methods for efficient high-dimensional inference in machine learning, aiming to reduce costs and enhance interpretability in applications like genome-wide studies.	ERC Starting Grant	€ 1,662,400	2024
The missing mathematical story of Bayesian uncertainty quantification for big data	This project aims to enhance scalable Bayesian methods through theoretical insights, improving their accuracy and acceptance in real-world applications like medicine and cosmology.	ERC Starting Grant	€ 1,492,750	2022
Cascade Processes for Sparse Machine Learning	The project aims to democratize deep learning by developing small-scale, resource-efficient models that enhance fairness, robustness, and interpretability through innovative techniques from statistical physics and network science.	ERC Starting Grant	€ 1,499,285	2023
