🤖 AI Summary
This study investigates the safe and effective deployment of large language models (LLMs) to support decision-making in high-stakes public services, specifically child welfare, where misjudging or overlooking cases that require professional expertise can have severe consequences. In collaboration with a major Canadian child welfare agency, the research integrates LocalLLM with BERTopic to analyze case trajectories and identify deviations from standard procedures. Findings indicate that the models successfully detect procedural omissions but exhibit significant blind spots in complex scenarios demanding nuanced social work judgment. The work underscores the necessity of participatory design approaches to co-develop language-based tools aligned with public sector needs and delineates the current limitations and future directions for AI systems in high-risk decision contexts.
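As a rough illustration of how the topic-modeling side of such a tracker might operate (the paper's actual data, preprocessing, and model configuration are not described here), the sketch below uses BERTopic to group synthetic case-note texts into procedural themes and then flags expected steps that never surface for a given case. The note templates, expected-keyword list, and gap heuristic are hypothetical assumptions, not the study's pipeline.

```python
# Hypothetical sketch only: synthetic notes, topic settings, and the gap
# heuristic are illustrative assumptions, not the study's implementation.
from bertopic import BERTopic

# Synthetic corpus of de-identified case notes (a real corpus would contain
# hundreds of notes spanning many cases).
templates = [
    "Referral received and screened for immediate safety concerns for family {i}.",
    "Home visit completed; living conditions and child wellbeing documented for family {i}.",
    "Safety plan drafted and reviewed with the caregiver for family {i}.",
    "Follow-up contact and community service referral scheduled for family {i}.",
]
case_notes = [t.format(i=i) for i in range(12) for t in templates]

# Fit BERTopic to surface recurring procedural themes across the corpus.
topic_model = BERTopic(min_topic_size=4)
topics, _ = topic_model.fit_transform(case_notes)
print(topic_model.get_topic_info())

# Naive deviation check: which expected procedural keywords never appear in the
# topics assigned to one case's notes? (Topic -1 is BERTopic's outlier bucket.)
one_case_topics = set(topics[:3])  # pretend the first three notes form one case
observed_terms = {
    word
    for topic_id in one_case_topics
    if topic_id != -1
    for word, _ in topic_model.get_topic(topic_id)
}
expected_keywords = {"referral", "visit", "safety", "follow"}
print("Possible procedural gaps:", expected_keywords - observed_terms)
```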
📝 Abstract
Governments are the primary providers of essential public services and are responsible for delivering them effectively. In high-stakes decision-making domains such as child welfare (CW), agencies must protect children without unnecessarily prolonging a family's engagement with the system. With growing optimism around AI, governments are pushing for its integration, but concerns about feasibility and harms remain. Through a collaboration with a large Canadian CW agency, we examined how LocalLLM and BERTopic models can track CW case progress. We demonstrate how these tools can potentially assist workers in opportunistically addressing gaps in their work by signaling case progress and deviations. Yet we also show how they fail to detect case trajectories that require discretionary judgments grounded in social work training, precisely the areas where practitioners would want support to pre-emptively address substantive case concerns. We also provide a roadmap of future participatory directions for co-designing language tools for and with the public sector.
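For the local-LLM side of such progress signaling, one way a prototype could work is to prompt a locally hosted instruction-tuned model to judge whether each required procedural step is documented in a case's notes. The sketch below uses the Hugging Face transformers text-generation pipeline; the model ID, prompt wording, and step list are placeholders rather than the configuration used in the study.

```python
# Hypothetical sketch: model choice, prompt, and step list are assumptions,
# not the paper's setup.
from transformers import pipeline

MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"  # placeholder: any locally hosted instruct model

# Load the model locally; device_map="auto" places it on available hardware.
generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")

case_summary = (
    "Referral screened on intake. Home visit completed and living conditions documented. "
    "Follow-up contact scheduled with the caregiver."
)
required_steps = [
    "safety screening at intake",
    "home visit",
    "safety plan drafted with caregiver",
]

for step in required_steps:
    prompt = (
        "You review child-welfare case notes for procedural completeness.\n"
        f"Case notes: {case_summary}\n"
        f"Is the step '{step}' documented in these notes? "
        "Answer YES or NO with one sentence of evidence."
    )
    out = generator(prompt, max_new_tokens=60, do_sample=False, return_full_text=False)
    print(step, "->", out[0]["generated_text"].strip())
```

A design note on this kind of check: keyword- or prompt-based step detection can flag omissions in documentation, which matches the paper's finding that procedural gaps are tractable, but it says nothing about whether a case trajectory is substantively appropriate, the discretionary territory where the abstract reports these tools fall short.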