RegGuard: AI-Powered Retrieval-Enhanced Assistant for Pharmaceutical Regulatory Compliance

📅 2026-01-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes an AI assistant tailored for pharmaceutical regulatory compliance, addressing challenges posed by frequently updated, heterogeneously formatted, and cross-jurisdictionally complex regulatory requirements that are costly and error-prone to manage manually. The system ingests multi-source regulatory documents through a secure data pipeline and introduces HiSACC, a novel hierarchical semantic chunking method that preserves semantic coherence across non-contiguous text segments. It further enhances retrieval relevance via ReLACE, a domain-adapted, listwise adaptive cross-encoder. Designed for auditability, traceability, and incremental updates, the system significantly improves response relevance, factual accuracy, and contextual focus in enterprise deployment, effectively mitigating hallucination risks and meeting the stringent demands of high-compliance environments.

📝 Abstract
The increasing frequency and complexity of regulatory updates present a significant burden for multinational pharmaceutical companies. Compliance teams must interpret evolving rules across jurisdictions, formats, and agencies, often manually, at high cost and risk of error. We introduce RegGuard, an industrial-scale AI assistant designed to automate the interpretation of heterogeneous regulatory texts and align them with internal corporate policies. The system ingests documents from diverse sources through a secure pipeline and enhances retrieval and generation quality with two novel components. HiSACC (Hierarchical Semantic Aggregation for Contextual Chunking) semantically segments long documents into coherent units while maintaining consistency across non-contiguous sections. ReLACE (Regulatory Listwise Adaptive Cross-Encoder for Reranking), a domain-adapted cross-encoder built on an open-source model, jointly models user queries and retrieved candidates to improve ranking relevance. Evaluations in enterprise settings demonstrate that RegGuard improves answer quality, specifically in terms of relevance, groundedness, and contextual focus, while significantly mitigating hallucination risk. The system architecture is built for auditability and traceability, featuring provenance tracking, access control, and incremental indexing, which makes it responsive to evolving document sources and relevant to any domain with stringent compliance demands.
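
The abstract gives no algorithmic detail for HiSACC, so the following is only a minimal sketch of the general idea of grouping semantically related segments even when they are not adjacent. It is not the authors' method. The bag-of-words `embed` function stands in for a real sentence encoder, and the names `group_segments` and `threshold` (with its value of 0.3) are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a sentence encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_segments(segments, threshold=0.3):
    """Greedily attach each segment to its most similar existing group.

    Segments that are far apart in the document can land in the same
    group; a segment that matches no group starts a new one.
    Returns a list of groups, each a list of segment indices.
    """
    groups = []  # each entry: {"indices": [...], "vector": Counter}
    for i, segment in enumerate(segments):
        vec = embed(segment)
        best, best_sim = None, threshold
        for group in groups:
            sim = cosine(vec, group["vector"])
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append({"indices": [i], "vector": vec})
        else:
            best["indices"].append(i)
            best["vector"] = best["vector"] + vec  # running group summary
    return [group["indices"] for group in groups]


segments = [
    "Labeling requirements for drug packaging",
    "Adverse event reporting timelines",
    "Packaging labeling must list drug ingredients",
    "Reporting of adverse event cases within fifteen days",
]
print(group_segments(segments))  # [[0, 2], [1, 3]]
```

Returning segment indices rather than merged text keeps each chunk traceable to its source positions, which is in the spirit of the provenance tracking the abstract mentions. A production system would presumably replace the toy embedding with a learned encoder and respect document hierarchy, neither of which is modeled here.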
Problem

Research questions and friction points this paper is trying to address.

regulatory compliance
pharmaceutical regulation
heterogeneous regulatory texts
compliance burden
regulatory updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

HiSACC
ReLACE
regulatory compliance
retrieval-augmented generation
hallucination mitigation
Siyuan Yang
Wallenberg-NTU Presidential Postdoctoral Fellowship, Nanyang Technological University
Computer Vision, Action Recognition
Xihan Bian
Xi'an Jiaotong-Liverpool University
Jiayin Tang
Roche Diagnostics (Suzhou)