Capturing Identity, Change, and the Long Tail in Knowledge Graphs

TRIFECTA aims to develop a semantic database that captures and contextualizes complex historical entities and concepts, enhancing humanities research through advanced language technology and collaboration.

Grant
€ 1.998.351
2023

Project details

Introduction

At first blush, entities and concepts such as “Dutch East India Company” or “coffee” may seem straightforward, but in fact they are complex and multifaceted. The wealth of digital sources offers massive potential to study these notions at an unprecedented scale. However, current technologies for distant reading cannot handle this complexity.

Project Goals

TRIFECTA aims to create a database that describes complex entities and concepts and their contexts by combining language and semantic web technology to extract and relate information from different texts over time.

Key Aims

In addition, a key aim of TRIFECTA is to advance the state of the art in these technologies to deal with:

  1. Change over time
  2. Connections to many different narratives

Methodology

Many language technology methods do not incorporate enough background knowledge to recognize and interpret complex entities and concepts in their historical contexts. Sophisticated knowledge representation methods from the semantic web can mitigate this shortcoming.

Knowledge Representation

By treating complex entities and concepts as rich networks (or graphs) of knowledge that can express change and relationships to different concepts in space and time, semantic databases can handle the complexity needed to make the outputs of language technology tools suited to humanities research.
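As a rough illustration of what a graph that can express change over time might look like, the sketch below stores statements with a validity period and asks what holds for an entity in a given year. It is not TRIFECTA's data model: the `Statement` class, the `snapshot` function, the predicate names, and the coffee statements are hypothetical placeholders chosen only for illustration. The only historical facts assumed are the founding (1602) and dissolution (1799) of the Dutch East India Company.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One edge in the graph, valid for an inclusive range of years."""
    subject: str
    predicate: str
    obj: str
    valid_from: int
    valid_to: Optional[int] = None  # None means the statement is open-ended

    def holds_in(self, year: int) -> bool:
        # True if the year falls inside the inclusive validity range.
        return self.valid_from <= year and (
            self.valid_to is None or year <= self.valid_to
        )


# Hypothetical example data; the coffee statements are placeholders,
# not historical claims taken from the project.
GRAPH = [
    Statement("Dutch East India Company", "type", "trading company", 1602, 1799),
    Statement("coffee", "regardedAs", "exotic luxury", 1600, 1699),
    Statement("coffee", "regardedAs", "everyday drink", 1700),
]


def snapshot(graph, subject, year):
    """Return the (predicate, object) pairs that hold for a subject in a year."""
    return {
        (s.predicate, s.obj)
        for s in graph
        if s.subject == subject and s.holds_in(year)
    }


if __name__ == "__main__":
    print(snapshot(GRAPH, "coffee", 1650))
    print(snapshot(GRAPH, "coffee", 1750))
    print(snapshot(GRAPH, "Dutch East India Company", 1800))
```

Attaching the period to each statement, rather than to the entity as a whole, lets the same entity carry different descriptions at different moments, which is the kind of change the paragraph above refers to.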

Use Cases

Via two use cases, I identify a set of core contentious entities and concepts in maritime and food history.

Data-Driven Approach

Next, through a data-driven, iterative approach, I advance beyond the state of the art in natural language technology for the humanities by targeting three key aspects of the recognition and modeling of complex concepts:

  1. Identity
  2. Change
  3. The long tail

Collaboration

I propose a novel peer-evaluation approach in which a team of humanities scholars, computational linguists, and semantic web researchers collaborate closely to create truly hybrid artificial intelligence systems. This collaboration will enable humanities research to scale to big data without losing sight of the contextual complexity.

Financial details & Timeline

Financial details

Grant amount: € 1.998.351
Total project budget: € 1.998.351

Timeline

Start date: 1 November 2023
End date: 31 October 2028
Grant year: 2023

Partners & Locations

Project partners

  • KONINKLIJKE NEDERLANDSE AKADEMIE VAN WETENSCHAPPEN - KNAW (coordinator)

Country(ies)

Netherlands

Similar projects within European Research Council

ERC Advanced...

Modelling Text as a Living Object in Cross-Document Context

InterText establishes a comprehensive framework for intertextuality in NLP, enabling efficient cross-document understanding through novel data models and neural representations for diverse applications.

€ 2.499.721
ERC Advanced...

Exploration of Unknown Environments for Digital Twins

The 'explorer' project aims to automate video data capture and labeling in open worlds to facilitate the creation of semantically rich Digital Twins for complex environments using AI-driven methods.

€ 2.476.718
ERC Consolid...

Natural Language Understanding for non-standard languages and dialects

DIALECT aims to enhance Natural Language Understanding by developing algorithms that integrate dialectal variation and reduce bias in data and labels for fairer, more accurate language models.

€ 1.997.815
ERC Consolid...

A Foundation for Empirical Multimodality Research

FOUNDATIONS develops a novel methodology for empirical research on multimodality by creating large, annotated corpora and using AI to analyze human communication across diverse cultural artifacts.

€ 1.999.974
ERC Advanced...

Deep Culture - Living with Difference in the Age of Deep Learning

DEEP CULTURE aims to critically explore the intersection of deep learning and cultural production through an interdisciplinary framework, fostering new methodologies and public engagement.

€ 2.500.000

Similar projects from other schemes

Mkb-innovati...

Graaf IGOR

Triply and Findest are developing Graaf IGOR, a smart database that validates technological concepts in real time from diverse sources using AI and graph databases.

€ 200.000
Mkb-innovati...

Real time knowledge extraction from unstructured big data streams

This project develops an application for structuring unstructured social media data to improve productivity in the agricultural sector through machine learning.

€ 199.307
Mkb-innovati...

Key Opinion-leader Landscape (KOL)

The project addresses the complex challenge of author and affiliation disambiguation in large datasets, in order to accelerate innovation in the pharmaceutical sector.

€ 19.680
Mkb-innovati...

Use of computational linguistics for gathering military intelligence

This project investigates the feasibility of using computational linguistics to gather military intelligence in order to improve security.

€ 20.000
Mkb-innovati...

Elementa Labs MIT 2022 – Digitizing lab notebooks of knowledge institutions

The project digitizes old university lab notebooks to make valuable knowledge accessible, so that new researchers can avoid mistakes and experiment more efficiently.

€ 19.200