Learning Isoform Fingerprints to Discover the Molecular Diversity of Life

This project aims to revolutionize proteomics by developing a novel data analysis strategy using deep learning to discover and quantify protein isoforms through their unique multi-dimensional fingerprints (ORIGINs).

Grant
€ 1.498.939
2023

Project details

Introduction

Did you know that ~80% of all proteomic data is not utilized? Proteins play a vital role in all biological processes and organisms. We believe that different versions of a single gene product, known as protein isoforms, shape the molecular diversity of life. However, comprehensive evidence at the protein level is not available.

Current Challenges

Liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS) is the de facto standard for measuring proteomes, but it performs poorly at identifying isoforms because at least 80% of the recorded information is never used.

Project Aim

I argue that isoforms leave a deterministic multi-dimensional fingerprint (ORIGINs) representing their physicochemical properties in each proteomic measurement. Therefore, the central aim of this project is to discover and quantify protein isoforms systematically using a novel MS-based proteomics data analysis strategy.

Methodology

  1. Training Deep Neural Networks
    By tapping into the wealth of data the proteomics community has already amassed, I will train deep neural networks that allow the prediction of ORIGINs.

  2. Innovative Data Analysis Strategy
    I will implement an innovative data analysis strategy that utilizes ORIGINs to identify and quantify isoforms.

  3. Application to Research Questions
    I will demonstrate that ORIGINs can substantially broaden our understanding of the molecular diversity of life by showcasing their application to four emerging and challenging questions in proteome research of varying biological and technical complexity.

Fundamental Questions

This will allow me to address a fundamental open question in biology: how extensively and how commonly are isoforms actually translated, and what functional roles might they be associated with?

Expected Outcomes

ORIGINs will improve the sensitivity, biological resolution, and accuracy with which proteins and their isoforms can be identified and quantified. Beyond this, the concept of ORIGINs can be applied to, and can improve, any proteomics experiment, thus holding the potential to revolutionize MS-based proteomics as a technology and elevate the whole field of protein-based research.

Financial details & Timeline

Financial details

Grant amount: € 1.498.939
Total project budget: € 1.498.939

Timeline

Start date: 1 June 2023
End date: 31 May 2028
Grant year: 2023

Partners & Locations

Project partners

  • TECHNISCHE UNIVERSITAET MUENCHEN (coordinator)

Country(ies)

Germany

Similar projects within the European Research Council

ERC Consolid...

Explainable Machine Learning for Identifying the Full Heterogeneity of Peptidoforms and Proteoforms

explAInProt aims to enhance proteomics by developing explainable, end-to-end machine learning models to identify undetected protein variants and improve clinical applications through advanced sequencing methods.

€ 1.992.500
ERC Consolid...

Proteome diversification in evolution

PROMISE aims to decode protein sequences and structures using AI to understand their interactions and evolution, ultimately transforming big data into actionable biological insights.

€ 1.952.762
ERC Advanced...

A Native Mass Spectrometry Systemic View of Cellular Structural Biology

This project aims to enhance native mass spectrometry for studying protein interactions and diversity in their natural cellular environments, advancing structural biology and related fields.

€ 2.954.167
ERC Proof of...

Precise, Rapid and Scalable Proteomics Solutions for Archaeology, Ecology, Wildlife Forensics and Food-chain Authentication

The PReciSe project aims to develop a fast, cost-effective proteomics method for taxonomic identification to enhance archaeological, ecological, and food supply chain verification.

€ 150.000
ERC Consolid...

Understanding the Language of Life: Identifying and Characterizing the Language Units in Protein Sequences

This project aims to decipher the "language of life" by developing methods to identify protein vocabulary and functions, paving the way for advancements in health and disease treatment.

€ 1.982.800

Similar projects from other funding schemes

Mkb-innovati...

Feasibility study: analysis device for full-length and folded proteins

Portal Biotech is developing an innovative nanopore technology for measuring full-length proteins, with the aim of revolutionizing diagnostics and improving clinical decision-making.

€ 20.000
Mkb-innovati...

Feasibility study: portable nanopore device for the identification of proteins and biomarkers

Portal Biotech is developing a portable analysis device based on nanopore technology to enable real-time protein measurements, with the aim of revolutionizing diagnostics.

€ 20.000
EIC Accelerator

First time ultra-sensitive and simultaneous quantification of proteins, interactions, and post-translational modifications in single cells to enable exponential growth in proteomics and interactomics

PICO-NGS aims to revolutionize proteomics by enabling ultra-sensitive, high-parallel measurement of proteins, interactions, and modifications, accelerating advancements in various industries.

€ 2.498.125
EIC Pathfinder

Computation driven development of novel vivo-like-DNA-nanotransducers for biomolecules structure identification

This project aims to develop DNA-nanotransducers for real-time detection and analysis of conformational changes in biomolecules, enhancing understanding of molecular dynamics and aiding drug discovery.

€ 3.000.418
EIC Pathfinder

Instrument-free 3D molecular imaging with the VOLumetric UMI-Network EXplorer

VOLUMINEX aims to revolutionize molecular imaging by providing an affordable 3D sequencing-based microscopy method for comprehensive spatial and transcriptomic data mapping.

€ 2.999.999