Improve Large Language Model Systems with User Logs

πŸ“… 2026-02-06
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the challenge of optimizing large language model systems using user logs, which are typically noisy, unstructured, and off-policy. To overcome these issues, the authors propose UNO, a unified framework that first distills raw logs into semi-structured rules and preference pairs, using query-feedback clustering to mitigate data heterogeneity. UNO further introduces a cognitive discrepancy quantification mechanism that adaptively filters noise and separates primary experience from reflective experience, enabling modular preference learning. To the best of the authors' knowledge, UNO is the first framework capable of adaptively extracting high-quality training signals directly from raw logs in a unified manner. Experiments demonstrate that UNO significantly outperforms retrieval-augmented generation (RAG) and memory-based baselines across multiple metrics, achieving state-of-the-art performance with superior efficiency. The code has been publicly released.

πŸ“ Abstract
Scaling training data and model parameters has long driven progress in large language models (LLMs), but this paradigm is increasingly constrained by the scarcity of high-quality data and diminishing returns on rising computational costs. As a result, recent work increasingly focuses on continual learning from real-world deployment, where user interaction logs provide a rich source of authentic human feedback and procedural knowledge. However, learning from user logs is challenging due to their unstructured and noisy nature. Vanilla LLM systems often struggle to distinguish useful feedback signals from noisy user behavior, and the disparity between user log collection and model optimization (i.e., the off-policy optimization problem) further exacerbates the challenge. To this end, we propose UNO (User log-driveN Optimization), a unified framework for improving LLM systems (LLMsys) with user logs. UNO first distills logs into semi-structured rules and preference pairs, then employs query-and-feedback-driven clustering to manage data heterogeneity, and finally quantifies the cognitive gap between the model's prior knowledge and the log data. This assessment guides the LLMsys to adaptively filter out noisy feedback and construct separate modules for primary and reflective experiences extracted from user logs, thereby improving future responses. Extensive experiments show that UNO achieves state-of-the-art effectiveness and efficiency, significantly outperforming Retrieval Augmented Generation (RAG) and memory-based baselines. We have open-sourced our code at https://github.com/bebr2/UNO .
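The pipeline described above (cluster logs, quantify the cognitive gap against the model's prior, filter noise, then route surviving entries into primary vs. reflective experience modules) can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the clustering key, the scalar `prior` function, and the gap thresholds are all assumptions standing in for the paper's learned components.

```python
# Hypothetical sketch of UNO-style log filtering. Function names, the
# scalar prior, and the thresholds below are illustrative assumptions,
# not the released implementation.
from collections import defaultdict


def cluster_logs(logs, key_tokens=2):
    """Group (query, feedback, score) triples by leading query tokens --
    a crude stand-in for query-and-feedback-driven clustering."""
    clusters = defaultdict(list)
    for query, feedback, score in logs:
        key = " ".join(query.lower().split()[:key_tokens])
        clusters[key].append((query, feedback, score))
    return clusters


def cognitive_gap(prior_score, log_score):
    """Discrepancy between the model's prior estimate of response
    quality and the quality implied by the user log."""
    return abs(prior_score - log_score)


def filter_and_route(clusters, prior, noise_threshold=0.5, primary_threshold=0.2):
    """Adaptively drop high-gap (likely noisy) entries; route low-gap
    confirmations to primary experience and moderate-gap corrections
    to reflective experience."""
    primary, reflective = [], []
    for entries in clusters.values():
        for query, feedback, log_score in entries:
            gap = cognitive_gap(prior(query), log_score)
            if gap > noise_threshold:
                continue  # treat as noisy feedback
            (primary if gap < primary_threshold else reflective).append(
                (query, feedback)
            )
    return primary, reflective
```

A toy run under these assumptions: with logs `[("how to sort a list in python", "use sorted()", 0.9), ("how to sort numbers", "random noise", 0.1), ("parse json safely", "use json.loads", 0.6)]` and a constant prior of 0.8, the first entry lands in primary experience, the second is dropped as noise, and the third becomes reflective experience.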
Problem

Research questions and friction points this paper is trying to address.

user logs
large language models
continual learning
noisy feedback
off-policy optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

User Logs
Continual Learning
Cognitive Gap
Preference Distillation
LLM Optimization