SHARE: Social-Humanities AI for Research and Education

📅 2026-04-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the absence of domain-specific pretrained language models for the social sciences and humanities (SSH), where general-purpose models often fail to capture disciplinary nuances and scholarly conventions. To bridge this gap, the authors introduce SHARE, a family of causal language models pretrained from scratch specifically for SSH, together with MIRROR, an interactive interface that generates no text and instead lets users review existing text while staying critically engaged with it. The study also constructs a custom Cloze benchmark for SSH, on which SHARE performs close to Phi-4, a general-purpose model that uses 100 times more tokens. The authors present this non-generative interface as a way to use the SHARE models without compromising SSH principles and norms.

📝 Abstract

This intermediate technical report introduces the SHARE family of base models and the MIRROR user interface. The SHARE models are the first causal language models fully pretrained by and for the social sciences and humanities (SSH). Their performance in modelling SSH texts is close to that of general-purpose models (Phi-4) that use 100 times more tokens, as shown by our custom SSH Cloze benchmark. The MIRROR user interface is designed for reviewing text inputs from the SSH disciplines while preserving critical engagement. By prototyping a generative AI interface that does not generate any text, we propose a way to harness the capabilities of the SHARE models without compromising the integrity of SSH principles and norms.

Problem

Research questions and friction points this paper is trying to address.

social sciences and humanities

generative AI

critical engagement

AI interface

academic integrity

Innovation

Methods, ideas, or system contributions that make the work stand out.

causal language models

social sciences and humanities

domain-specific pretraining

non-generative interface

critical engagement
