🤖 AI Summary
Software logging faces a dual challenge: excessive logging incurs high operational costs, while insufficient logging introduces significant debugging and monitoring risks. Existing tools lack principled support for the fundamental “whether-to-log” decision and fail to model the multi-stage, compositional nature of log generation. This paper proposes AutoLogger—a novel hybrid multi-agent framework that comprehensively covers the full logging decision chain: *judgment* (whether to log), *localization* (where to insert), and *generation* (what to log). AutoLogger integrates static program analysis, retrieval-augmented reasoning, fine-tuned binary classifiers, and LLM-based evaluation mechanisms. A dedicated judgment model accurately identifies logging necessity; a localization agent and a generation agent jointly perform context-aware log injection. Evaluated on three open-source projects, AutoLogger achieves 96.63% F1-score in logging necessity classification and improves end-to-end log quality—assessed via LLM-as-a-judge—by 16.13% over the strongest baseline, with compatibility across diverse LLM backbones.
📝 Abstract
Software logging is critical for system observability, yet developers face a dual crisis of costly overlogging and risky underlogging. Existing automated logging tools often overlook the fundamental whether-to-log decision and struggle with the composite nature of logging. In this paper, we propose AutoLogger, a novel hybrid framework that addresses the complete end-to-end logging pipeline. AutoLogger first employs a fine-tuned classifier, the Judger, to accurately determine whether a method requires new logging statements. If logging is needed, a multi-agent system is activated. The system includes specialized agents: a Locator dedicated to determining where to log, and a Generator focused on what to log. These agents work together, using our purpose-built program analysis and retrieval tools. We evaluate AutoLogger on a large corpus from three mature open-source projects against state-of-the-art baselines. Our results show that AutoLogger achieves a 96.63% F1-score on the crucial whether-to-log decision. In an end-to-end setting, AutoLogger improves the overall quality of generated logging statements by 16.13% over the strongest baseline, as measured by an LLM-as-a-judge score. We also demonstrate that our framework is generalizable, consistently boosting the performance of various backbone LLMs.
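The three-stage decision chain the abstract describes (judgment → localization → generation) can be sketched as a gated pipeline. The sketch below is purely illustrative: all class and function names are hypothetical, and the Judger, Locator, and Generator bodies are trivial stand-ins (a keyword heuristic instead of the paper's fine-tuned classifier and LLM agents), shown only to make the control flow concrete.

```python
# Hypothetical sketch of a judgment -> localization -> generation pipeline.
# None of these names come from the paper; the stage bodies are placeholders.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggingSuggestion:
    line_no: int     # where to insert the statement (Locator's output)
    statement: str   # the logging statement text (Generator's output)


def judger(method_source: str) -> bool:
    """Stand-in for the fine-tuned whether-to-log classifier.

    Here: a trivial heuristic that flags methods containing exception
    handling, purely so the sketch is runnable."""
    return "except" in method_source or "raise" in method_source


def locator(method_source: str) -> int:
    """Stand-in for the Locator agent: pick a line to log at.

    Here: the first line inside an `except` block."""
    for i, line in enumerate(method_source.splitlines()):
        if line.strip().startswith("except"):
            return i + 1
    return 0


def generator(method_source: str, line_no: int) -> str:
    """Stand-in for the Generator agent: produce the statement text."""
    return 'logger.error("operation failed: %s", exc)'


def autologger_pipeline(method_source: str) -> Optional[LoggingSuggestion]:
    # Stage 1: judgment -- methods the Judger rejects get no logging at all,
    # which is how the framework avoids costly overlogging.
    if not judger(method_source):
        return None
    # Stage 2: localization (where to log).
    line_no = locator(method_source)
    # Stage 3: generation (what to log), conditioned on the chosen location.
    return LoggingSuggestion(line_no, generator(method_source, line_no))


method = """\
def read_config(path):
    try:
        return open(path).read()
    except OSError as exc:
        raise
"""
suggestion = autologger_pipeline(method)
```

The key design point the sketch captures is that localization and generation only run after the binary judgment passes, so the expensive agentic stages never execute for methods that need no new logging.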