Uniting Statistical Testing and Machine Learning for Safe Predictions
The project aims to enhance the interpretability and reliability of machine learning predictions by integrating statistical methods to establish robust error bounds and ensure safe deployment in real-world applications.
Project details
Introduction
Recent breakthroughs in machine learning (ML) have brought about a transformative impact on decision-making, autonomous systems, medical diagnosis, and the creation of new scientific knowledge. However, this progress has a major drawback: modern predictive systems are extremely complex and hard to interpret, a problem known as the ‘black-box effect’.
Challenges of Black-Box Systems
The opaque nature of modern ML models, trained on increasingly diverse, incomplete, and noisy data, and later deployed in varying environments, hinders our ability to comprehend what drives inaccurate predictions, biased outcomes, and test-time failures.
Perhaps the most pressing question of our times is this: can we trust the predictions for future unseen instances obtained by black-box systems? The lack of practical guarantees on the limits of predictive performance poses a significant obstacle to deploying ML in applications that affect people's lives, opportunities, and science.
Goals and Objectives
My overarching goal is to put precise, interpretable, and robust error bounds on ML predictions, communicating rigorously what can be honestly inferred from data. I call for the development of protective ecosystems that can be seamlessly plugged into any ML model to monitor and guarantee its safety.
Methodology
This proposal introduces a unique interplay between statistics (the grammar of science) and ML (the art of learning from experience). Leveraging my expertise in both domains, I will show how statistical methodologies such as conformal prediction and test martingales can empower ML, and how recent breakthroughs in ML such as semi-supervised learning and domain adaptation technologies can empower statistics.
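To make the idea of an error bound around a black-box prediction concrete, here is a minimal sketch of split conformal prediction for regression. It is an illustration only, not the project's own method. The function name `split_conformal_interval` and its parameters are hypothetical, and the sketch assumes that the calibration residuals come from held-out data that is exchangeable with the test data.

```python
import numpy as np

def split_conformal_interval(cal_residuals, y_pred, alpha=0.1):
    """Hypothetical helper: symmetric prediction intervals around y_pred.

    cal_residuals: y_true - y_hat on a held-out calibration set (assumed
                   exchangeable with future test points).
    y_pred:        point predictions of any fitted model on new inputs.
    alpha:         target miscoverage level, e.g. 0.1 for 90% coverage.
    """
    scores = np.sort(np.abs(np.asarray(cal_residuals, dtype=float)))
    n = scores.size
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the only valid interval is the whole line.
    q = scores[k - 1] if k <= n else np.inf
    y_pred = np.asarray(y_pred, dtype=float)
    return y_pred - q, y_pred + q

# Usage with made-up calibration residuals.
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, 0.6, -0.7, 0.8, 0.9]
lower, upper = split_conformal_interval(residuals, [1.0, 2.0], alpha=0.5)
```

Under the exchangeability assumption, the interval covers the true value with probability at least 1 - alpha on average over calibration and test draws, whatever model produced `y_pred`. This is what makes the procedure pluggable into any predictor.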
Addressing Real-World Problems
I will tackle challenges rooted in real-world problems concerning:
- Availability of training data
- Quality of training data
- Test-time drifting data
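One way test-time drift can be monitored is with a conformal test martingale, sketched below. This is an illustration under stated assumptions rather than the project's own procedure. The helper names `power_martingale` and `first_alarm` are hypothetical, the input p-values are assumed to come from a conformal procedure (so they are uniform on [0, 1] while the data remain exchangeable), and the power betting function with parameter `epsilon` is just one common choice.

```python
import numpy as np

def power_martingale(p_values, epsilon=0.5):
    """Hypothetical helper: running wealth of the power betting strategy.

    Each step multiplies the wealth by epsilon * p**(epsilon - 1), with
    0 < epsilon < 1. This bet integrates to 1 over [0, 1], so the wealth
    is a nonnegative martingale starting at 1 when p-values are uniform.
    """
    # Clipping away from 0 only caps the bet; it also avoids log(0).
    p = np.clip(np.asarray(p_values, dtype=float), 1e-12, 1.0)
    log_bets = np.log(epsilon) + (epsilon - 1.0) * np.log(p)
    # Accumulate in log space for numerical stability.
    return np.exp(np.cumsum(log_bets))

def first_alarm(martingale, delta=0.01):
    """Index of the first step where wealth reaches 1/delta, else None."""
    hits = np.flatnonzero(np.asarray(martingale) >= 1.0 / delta)
    return int(hits[0]) if hits.size else None

# Usage: a run of very small p-values makes the wealth grow quickly.
wealth = power_martingale([0.01, 0.01, 0.01], epsilon=0.5)
alarm_step = first_alarm(wealth, delta=0.01)
```

By Ville's inequality, while the data stay exchangeable the wealth reaches 1/delta with probability at most delta, so an alarm is evidence that the test data have drifted.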
Expected Outcomes
A successful outcome would not only lead to a timely and rigorous way toward safe ML but may also significantly reform the way we develop, deploy, and interact with learning systems.
Financial details & Timeline
Financial details
Grant amount | €1,500,000 |
Total project budget | €1,500,000 |
Timeline
Start date | 1 November 2024 |
End date | 31 October 2029 |
Grant year | 2024 |
Partners & Locations
Project partners
- TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY (coordinator)
Country/countries
Similar projects within the European Research Council
Project | Description | Scheme | Amount | Year |
---|---|---|---|---|
Optimizing for Generalization in Machine Learning | This project aims to unravel the mystery of generalization in machine learning by developing novel optimization algorithms to enhance the reliability and applicability of ML in critical domains. | ERC Starting... | €1,494,375 | 2023 |
Controlling Large Language Models | Develop a framework to understand and control large language models, addressing biases and flaws to ensure safe and responsible AI adoption. | ERC Starting... | €1,500,000 | 2024 |
Dynamics-Aware Theory of Deep Learning | This project aims to create a robust theoretical framework for deep learning, enhancing understanding and practical tools to improve model performance and reduce complexity in various applications. | ERC Starting... | €1,498,410 | 2022 |
Modern Challenges in Learning Theory | This project aims to develop a new theory of generalization in machine learning that better models real-world tasks and addresses data efficiency and privacy challenges. | ERC Starting... | €1,433,750 | 2022 |
Control for Deep and Federated Learning | CoDeFeL aims to enhance machine learning methods through control theory, developing efficient ResNet architectures and federated learning techniques for applications in digital medicine and recommendations. | ERC Advanced... | €2,499,224 | 2024 |