ICE-ID: A Novel Historical Census Data Benchmark Comparing NARS against LLMs, & a ML Ensemble on Longitudinal Identity Resolution

📅 2025-06-11
🤖 AI Summary
This paper addresses longitudinal identity resolution in Iceland's historical census data, which spans 220 years (1703–1920) and in which name variation, generational turnover, and complex kinship structures impede consistent identification of individuals across time. To this end, we introduce ICE-ID, to the best of our knowledge the first large-scale, open tabular benchmark for long-term person-entity matching in a real-world population. We define identity resolution tasks within and across census waves, with documented metrics and splits. We evaluate handcrafted rule-based matchers, an ML ensemble, and LLM-style models for structured data (e.g., TabTransformer) against the Non-Axiomatic Reasoning System (NARS), a general-purpose reasoning framework built on Non-Axiomatic Logic (NAL) that we apply to tabular record linkage. Experiments show that NARS is surprisingly simple and competitive with the other approaches, achieving state-of-the-art results on this task. The ICE-ID dataset and source code are publicly released, providing a reproducible benchmark for historical demography, digital humanities, and temporal entity resolution.

📝 Abstract
We introduce ICE-ID, a novel benchmark dataset for historical identity resolution, comprising 220 years (1703-1920) of Icelandic census records. ICE-ID spans multiple generations of longitudinal data, capturing name variations, demographic changes, and rich genealogical links. To the best of our knowledge, this is the first large-scale, open tabular dataset specifically designed to study long-term person-entity matching in a real-world population. We define identity resolution tasks (within and across census waves) with clearly documented metrics and splits. We evaluate a range of methods: handcrafted rule-based matchers, an ML ensemble, and LLMs for structured data (e.g., transformer-based tabular networks), against a novel approach to tabular data called NARS (Non-Axiomatic Reasoning System), a general-purpose AI framework designed to reason with limited knowledge and resources. Its core is Non-Axiomatic Logic (NAL), a term-based logic. Our experiments show that NARS is surprisingly simple and competitive with other standard approaches, achieving SOTA at our task. By releasing ICE-ID and our code, we enable reproducible benchmarking of identity resolution approaches in longitudinal settings and hope that ICE-ID opens new avenues for cross-disciplinary research in data linkage and historical analytics.
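To make the cross-wave matching task concrete, the following is a minimal sketch of a handcrafted rule-based matcher scored with pairwise precision, recall, and F1. Everything in it is an assumption for illustration: the `Record` fields, the matching rule, the year tolerance, the toy records, and the choice of pairwise metrics are hypothetical and are not taken from the ICE-ID schema or evaluation protocol.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Record:
    """One census row. All field names are hypothetical, not the ICE-ID schema."""
    record_id: str
    person_id: str  # ground-truth identity label, used only for scoring
    first_name: str
    patronym: str
    birth_year: int


def rule_match(a: Record, b: Record, year_tolerance: int = 2) -> bool:
    """Toy rule: identical case-folded names and birth years within a tolerance."""
    same_first = a.first_name.casefold() == b.first_name.casefold()
    same_patronym = a.patronym.casefold() == b.patronym.casefold()
    close_birth = abs(a.birth_year - b.birth_year) <= year_tolerance
    return same_first and same_patronym and close_birth


def pairwise_scores(wave_a, wave_b, matcher):
    """Compare every cross-wave pair and return (precision, recall, F1)."""
    tp = fp = fn = 0
    for a, b in product(wave_a, wave_b):
        predicted = matcher(a, b)
        actual = a.person_id == b.person_id
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented records for two census waves (not real ICE-ID data).
earlier_wave = [
    Record("a1", "p1", "Jón", "Jónsson", 1680),
    Record("a2", "p2", "Guðrún", "Jónsdóttir", 1685),
]
later_wave = [
    Record("b1", "p1", "Jón", "Jónsson", 1681),        # same person, year drift
    Record("b2", "p2", "Gudrun", "Jonsdottir", 1685),  # same person, spelling variant
    Record("b3", "p3", "Jón", "Jónsson", 1682),        # different person, same name
]

print(pairwise_scores(earlier_wave, later_wave, rule_match))
```

On this toy input the rule finds one true link, misses the spelling variant, and wrongly links the namesake, which illustrates why name variation and common names make exact-match rules a weak baseline for longitudinal linkage.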
Problem

Research questions and friction points this paper is trying to address.

Benchmarking identity resolution in historical census data
Comparing NARS, ML ensemble, and LLMs for person-entity matching
Enabling reproducible research in longitudinal data linkage
Innovation

Methods, ideas, or system contributions that make the work stand out.

ICE-ID benchmark dataset for historical identity resolution
Evaluates ML ensemble and LLMs against NARS
NARS achieves SOTA in identity resolution tasks
Gonçalo Hora de Carvalho
IIIM, Iceland
Lazar S. Popov
IIIM, Iceland
Sander Kaatee
IIIM, Iceland
Kristinn R. Thórisson
Department of Computer Science, Reykjavik University
Tangrui Li
Temple University
Pétur Húni Björnsson
Department of Nordic Studies and Linguistics, University of Copenhagen
Jilles S. Dibangoye
Associate Professor at University of Groningen
artificial intelligence, reinforcement learning, game theory