Capturing Identity, Change, and the Long Tail in Knowledge Graphs
TRIFECTA aims to develop a semantic database that captures and contextualizes complex historical entities and concepts, enhancing humanities research through advanced language technology and collaboration.
Project details
Introduction
At first blush, entities and concepts such as “Dutch East India Company” or “coffee” may seem straightforward, but in fact they are complex and multifaceted. The wealth of digital sources offers massive potential to study these notions at an unprecedented scale. However, current technologies for distant reading cannot yet deal with this complexity.
Project Goals
TRIFECTA aims to create a database that describes complex entities and concepts and their contexts by combining language and semantic web technology to extract and relate information from different texts over time.
Key Aims
In addition, a key aim of TRIFECTA is to advance the state of the art in these technologies to deal with:
- Change over time
- Connections to many different narratives
Methodology
Many language technology methods do not incorporate enough background knowledge to recognize and interpret complex entities and concepts in their historical contexts. Sophisticated knowledge representation methods from the semantic web can mitigate this shortcoming.
Knowledge Representation
By treating complex entities and concepts as rich networks (or graphs) of knowledge that can express change and relationships to different concepts in space and time, semantic databases can handle the complexity needed to make the outputs of language technology tools suitable for humanities research.
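As a minimal sketch of this idea, the snippet below stores statements about an entity as graph edges that are only valid within a period, and maps several surface names onto one identifier. It is not the TRIFECTA data model: the class names, the `ex:VOC` identifier, the predicate names, and the simplified year ranges are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One edge in the graph, valid only within [start, end] (years, inclusive)."""
    subject: str
    predicate: str
    obj: str
    start: Optional[int] = None  # None means open-ended or unknown
    end: Optional[int] = None

    def holds_in(self, year: int) -> bool:
        after_start = self.start is None or self.start <= year
        before_end = self.end is None or year <= self.end
        return after_start and before_end


@dataclass
class TemporalGraph:
    """Hypothetical container: time-scoped statements plus a name-to-identifier map."""
    statements: list = field(default_factory=list)
    aliases: dict = field(default_factory=dict)  # lowercased surface form -> identifier

    def resolve(self, name: str) -> str:
        # Identity: different names in the sources may denote the same entity.
        return self.aliases.get(name.lower(), name)

    def add(self, subject, predicate, obj, start=None, end=None):
        self.statements.append(
            Statement(self.resolve(subject), predicate, obj, start, end)
        )

    def snapshot(self, name: str, year: int):
        # Change: return only the statements that hold in the given year.
        entity = self.resolve(name)
        return [
            (s.predicate, s.obj)
            for s in self.statements
            if s.subject == entity and s.holds_in(year)
        ]


# Illustrative, simplified example data (assumed for this sketch).
graph = TemporalGraph(aliases={
    "dutch east india company": "ex:VOC",
    "vereenigde oostindische compagnie": "ex:VOC",
    "voc": "ex:VOC",
})
graph.add("VOC", "status", "active chartered company", 1602, 1799)
graph.add("Dutch East India Company", "asian_headquarters", "Batavia", 1619, 1799)

print(graph.snapshot("VOC", 1610))
print(graph.snapshot("Dutch East India Company", 1700))
print(graph.snapshot("voc", 1850))
```

Querying the same entity in different years returns different descriptions, which is the kind of context a flat list of recognized names cannot express. A real semantic database would use established semantic web vocabularies rather than ad hoc strings.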
Use Cases
Via two use cases, I identify a set of core contentious entities and concepts in maritime and food history.
Data-Driven Approach
Next, through a data-driven, iterative approach, I advance beyond the state of the art in natural language technology for the humanities by targeting three key aspects of the recognition and modeling of complex concepts:
- Identity
- Change
- The long tail
Collaboration
I propose a novel peer-evaluation approach in which a team of humanities scholars, computational linguists, and semantic web researchers collaborate closely to create truly hybrid artificial intelligence systems. This collaboration will enable humanities research to scale to big data without losing sight of the contextual complexity.
Financial details & Timeline
Financial details
Grant amount | € 1.998.351 |
Total project budget | € 1.998.351 |
Timeline
Start date | 1 November 2023 |
End date | 31 October 2028 |
Grant year | 2023 |
Partners & Locations
Project partners
- KONINKLIJKE NEDERLANDSE AKADEMIE VAN WETENSCHAPPEN - KNAW (lead partner)
Similar projects within the European Research Council
Project | Description | Scheme | Amount | Year |
---|---|---|---|---|
Modelling Text as a Living Object in Cross-Document Context | InterText establishes a comprehensive framework for intertextuality in NLP, enabling efficient cross-document understanding through novel data models and neural representations for diverse applications. | ERC Advanced... | € 2.499.721 | 2023 |
Exploration of Unknown Environments for Digital Twins | The 'explorer' project aims to automate video data capture and labeling in open worlds to facilitate the creation of semantically rich Digital Twins for complex environments using AI-driven methods. | ERC Advanced... | € 2.476.718 | 2023 |
Natural Language Understanding for non-standard languages and dialects | DIALECT aims to enhance Natural Language Understanding by developing algorithms that integrate dialectal variation and reduce bias in data and labels for fairer, more accurate language models. | ERC Consolid... | € 1.997.815 | 2022 |
A Foundation for Empirical Multimodality Research | FOUNDATIONS develops a novel methodology for empirical research on multimodality by creating large, annotated corpora and using AI to analyze human communication across diverse cultural artifacts. | ERC Consolid... | € 1.999.974 | 2024 |
Deep Culture - Living with Difference in the Age of Deep Learning | DEEP CULTURE aims to critically explore the intersection of deep learning and cultural production through an interdisciplinary framework, fostering new methodologies and public engagement. | ERC Advanced... | € 2.500.000 | 2024 |
Similar projects from other schemes
Project | Description | Scheme | Amount | Year |
---|---|---|---|---|
Graaf IGOR | Triply and Findest are developing Graaf IGOR, a smart database that validates technological concepts in real time from diverse sources using AI and graph databases. | Mkb-innovati... | € 200.000 | 2020 |
Real time knowledge extraction from unstructured big data streams | This project develops an application for structuring unstructured social media data to improve productivity in the agricultural sector through machine learning. | Mkb-innovati... | € 199.307 | 2017 |
Key Opinion-leader Landscape (KOL) | The project addresses the complex challenge of author and affiliation disambiguation in large datasets, in order to accelerate innovation in the pharmaceutical sector. | Mkb-innovati... | € 19.680 | 2020 |
Inzet van computational linguistics voor het vergaren van military intelligence (Using computational linguistics to gather military intelligence) | This project investigates the feasibility of using computational linguistics to gather military intelligence in order to improve security. | Mkb-innovati... | € 20.000 | 2023 |
Elementa Labs MIT 2022 – Digitaliseren van Lab notebooks van kennisinstanties (Digitizing lab notebooks of knowledge institutions) | The project digitizes old lab notebooks from universities to make valuable knowledge accessible, so that new researchers can avoid mistakes and experiment more efficiently. | Mkb-innovati... | € 19.200 | 2022 |