SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines whether large language models (LLMs) respond ethically to inappropriate requests in scientific research contexts, particularly covert violations of research integrity. The authors introduce SciIntBench, a benchmark that combines adversarial prompting with established responsible conduct of research (RCR) norms, comprising 810 prompts across ten RCR categories and three scientific domains. Each scenario is written in overt adversarial, covert adversarial, and benign variants. Analyzing 12,960 responses from 16 commercial and open-weight LLMs, the study finds that models generally refuse overt misconduct but refuse covert violations far less reliably, especially when the misconduct is framed as a pressure-driven shortcut. Refusal boundaries are weakest around transparency, plagiarism, and fabrication.

📝 Abstract

Large language models (LLMs) are increasingly used to support scientific work, but it is unclear whether they uphold responsible conduct of research (RCR) norms or help undermine them. We introduce SciIntBench, an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains. Each scenario appears in Overt Adversarial, Covert Adversarial, and Benign versions, allowing us to jointly measure framing-sensitive refusal of misconduct and helpfulness on legitimate requests. We evaluate 16 commercial and open-weight LLMs from six providers (2024–2026), producing 12,960 responses. We find that scientific integrity alignment is strongly framing-sensitive: models refuse explicit misconduct far more reliably than covert violations, and fail especially often when misconduct is presented as a pressure-driven shortcut. Refusals vary by RCR category, with weaker boundaries around transparency, plagiarism, and fabrication.
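The counts in the abstract can be checked with a short sketch, which also illustrates how a per-framing refusal rate might be tallied. The prompt, model, and framing figures come from the abstract. The scenario count is derived on the assumption that every scenario has exactly one prompt per framing. The function name `refusal_rate_by_framing`, the record layout, and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Figures stated in the abstract.
N_PROMPTS = 810
N_MODELS = 16
FRAMINGS = ("overt_adversarial", "covert_adversarial", "benign")

# Derived values, assuming exactly one prompt per scenario per framing
# and one response per model per prompt.
n_scenarios = N_PROMPTS // len(FRAMINGS)
n_responses = N_PROMPTS * N_MODELS


def refusal_rate_by_framing(records):
    """Return {framing: fraction refused} from (framing, refused) pairs.

    Hypothetical helper: the paper's actual scoring scheme is not
    described in the abstract.
    """
    totals = defaultdict(int)
    refused = defaultdict(int)
    for framing, was_refused in records:
        totals[framing] += 1
        refused[framing] += int(was_refused)
    return {framing: refused[framing] / totals[framing] for framing in totals}


# Made-up toy labels for illustration only, NOT results from the paper.
toy_records = [
    ("overt_adversarial", True),
    ("overt_adversarial", True),
    ("covert_adversarial", True),
    ("covert_adversarial", False),
    ("benign", False),
    ("benign", False),
]
rates = refusal_rate_by_framing(toy_records)

print(n_scenarios, n_responses)
print(rates)
```

Under these assumptions, 810 prompts correspond to 270 scenarios, and 810 prompts answered by 16 models give the 12,960 responses reported. Keeping the benign framing in the same tally makes it possible to see over-refusal of legitimate requests alongside under-refusal of covert misconduct.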

Problem

Research questions and friction points this paper is trying to address.

research integrity

large language models

adversarial framing

responsible conduct of research

scientific misconduct

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial benchmarking

research integrity

framing sensitivity