Controlling Large Language Models
Develop a framework to understand and control large language models, addressing biases and flaws to ensure safe and responsible AI adoption.
Project details
Introduction
Large language models (LMs) are quickly becoming the backbone of many artificial intelligence (AI) systems, achieving state-of-the-art results in many tasks and application domains. Despite the rapid progress in the field, AI systems suffer from multiple flaws inherited from the underlying LMs: biased behavior, out-of-date information, confabulations, flawed reasoning, and more.
Understanding and Controlling LMs
If we wish to control these systems, we must first understand how they work and develop mechanisms to intervene, update, and repair them. However, the black-box nature of LMs makes them largely inaccessible to such interventions. In this proposal, our overarching goal is to:
Develop a framework for elucidating the internal mechanisms in LMs and for controlling their behavior in an efficient, interpretable, and safe manner.
Objectives
To achieve this goal, we will work through four objectives:
- Dissecting Internal Mechanisms: We will dissect the internal mechanisms of information storage and recall in LMs and develop ways to update and repair such information.
- Illuminating Higher-Level Capabilities: We will illuminate the mechanisms behind higher-level capabilities of LMs, such as reasoning and simulation, and repair problems stemming from alignment steps.
- Investigating Training Processes: We will investigate how the training processes of LMs affect their emergent mechanisms and develop methods for fine-grained control over the training process.
- Establishing a Standard Benchmark: Finally, we will establish a standard benchmark for mechanistic interpretability of LMs to consolidate disparate efforts in the community.
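The kind of intervention envisioned in the first two objectives can be illustrated with a toy sketch of activation patching, a standard mechanistic-interpretability technique: run a model on a clean and a corrupted input, then re-run the corrupted input while copying one internal activation over from the clean run, and check how much of the clean behavior is restored. The two-layer NumPy network, seed, and variable names below are invented for illustration and are not part of the proposal:

```python
import numpy as np

# Toy "model": a two-layer network standing in for a transformer block.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 2))

def forward(x, patch=None):
    """Forward pass; if `patch` is given, overwrite the hidden state."""
    h = np.maximum(x @ W1, 0.0)  # internal (hidden) activations, ReLU
    if patch is not None:
        h = patch                # the intervention: patch the internal state
    return h @ W2, h

x_clean = rng.standard_normal(4)
x_corrupt = x_clean + rng.standard_normal(4)  # perturbed input

y_clean, h_clean = forward(x_clean)       # record clean activations
y_corrupt, _ = forward(x_corrupt)          # corrupted behavior
y_patched, _ = forward(x_corrupt, patch=h_clean)  # corrupted run, clean state

# Patching the clean hidden state into the corrupted run restores the
# clean output exactly here, because the output depends only on h.
assert np.allclose(y_patched, y_clean)
```

In a real LM the same logic is applied per layer and per token position (e.g. via forward hooks), and the degree of restoration localizes which internal components carry the information being studied.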
Conclusion
Taken as a whole, we expect the proposed research to empower different stakeholders and ensure a safe, beneficial, and responsible adoption of LMs in AI technologies by our society.
Financial details & Timeline
Financial details
Grant amount | € 1.500.000
Total project budget | € 1.500.000
Timeline
Start date | 1-11-2024
End date | 31-10-2029
Subsidy year | 2024
Partners & Locations
Project partners
- TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY (lead partner)
Country(ies)
Similar projects within the European Research Council
Project | Scheme | Amount | Year
---|---|---|---
Uniting Statistical Testing and Machine Learning for Safe Predictions: The project aims to enhance the interpretability and reliability of machine learning predictions by integrating statistical methods to establish robust error bounds and ensure safe deployment in real-world applications. | ERC Starting... | € 1.500.000 | 2024
DEep COgnition Learning for LAnguage GEneration: This project aims to enhance NLP models by integrating machine learning, cognitive science, and structured memory to improve out-of-domain generalization and contextual understanding in language generation tasks. | ERC Consolid... | € 1.999.595 | 2023
Control for Deep and Federated Learning: CoDeFeL aims to enhance machine learning methods through control theory, developing efficient ResNet architectures and federated learning techniques for applications in digital medicine and recommendations. | ERC Advanced... | € 2.499.224 | 2024
Towards an Artificial Cognitive Science: This project aims to establish a new field of artificial cognitive science by applying cognitive psychology to enhance the learning and decision-making of advanced AI models. | ERC Starting... | € 1.496.000 | 2024
Next-Generation Natural Language Generation: This project aims to enhance natural language generation by integrating neural models with symbolic representations for better control, adaptability, and reliable evaluation across various applications. | ERC Starting... | € 1.420.375 | 2022
Similar projects from other schemes
Project | Scheme | Amount | Year
---|---|---|---
LLM Innovatie voor Klantenservice software (LLM): The project investigates the feasibility of an innovative LLM for automated customer service, focused on privacy and customization. | Mkb-innovati... | € 20.000 | 2024
Project Hominis: The project focuses on developing an ethical AI system for natural language processing that minimizes biases and manages technical, economic, and regulatory risks. | Mkb-innovati... | € 20.000 | 2022
Feasibility study of AIPerLearn (AI-Powered Personalized Learning): STARK Learning investigates the application and training of AI models to automate the development of personalized learning materials and to safeguard their quality and validation. | Mkb-innovati... | € 20.000 | 2023
eXplainable AI in Personalized Mental Healthcare: This project develops an innovative AI platform that engages users in improving algorithms through feedback loops, focused on transparency and reliability in mental healthcare. | Mkb-innovati... | € 350.000 | 2022