Scalable and Efficient Large-Scale Log Analysis with LLMs: An IT Software Support Case Study

📅 2025-11-17
🤖 AI Summary
Manual analysis of large-scale IT system logs is impractical, so this paper proposes a lightweight log analysis framework built on large language models (LLMs). The method introduces a CPU-efficient inference mechanism that raises LLM throughput on resource-constrained hardware without compromising output quality. It integrates log parsing, contextual modeling, and fault-oriented semantic reasoning to enable end-to-end automated diagnosis. In production since March 2024, the system has been scaled across 70 software products and has processed over 2,000 tickets. The authors report time savings of more than 300 labor hours and an estimated USD 15,444 per month in manpower costs compared with traditional log analysis practices. The framework is presented as a practical, scalable, and cost-effective approach to LLM-based log analytics in real operational environments.

📝 Abstract
IT environments typically have logging mechanisms to monitor system health and detect issues. However, the huge volume of generated logs makes manual inspection impractical, highlighting the importance of automated log analysis in IT Software Support. In this paper, we propose a log analytics tool that leverages Large Language Models (LLMs) for log data processing and issue diagnosis, enabling the generation of automated insights and summaries. We further present a novel approach for efficiently running LLMs on CPUs to process massive log volumes in minimal time without compromising output quality. We share the insights and lessons learned from deploying the tool, which has been in production since March 2024 and has scaled across 70 software products, processing over 2,000 tickets for issue diagnosis and achieving time savings of 300+ man-hours and an estimated $15,444 per month in manpower costs compared to traditional log analysis practices.
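
The two savings figures in the abstract can be related with simple arithmetic. The sketch below assumes that the 300+ hours are also a monthly figure and uses the lower bound of exactly 300 hours; the abstract does not state either point explicitly, and the function and constant names are illustrative only.

```python
# Figures quoted in the abstract.
MONTHLY_SAVINGS_USD = 15_444
# Assumption: the "300+" hours are per month; 300 is used as the lower bound.
HOURS_SAVED_PER_MONTH = 300


def implied_hourly_rate(savings_usd: float, hours: float) -> float:
    """Return the labor cost per hour implied by a savings figure."""
    return savings_usd / hours


def annualized_savings(monthly_savings_usd: float) -> float:
    """Extrapolate a monthly savings figure to twelve months."""
    return monthly_savings_usd * 12


rate = implied_hourly_rate(MONTHLY_SAVINGS_USD, HOURS_SAVED_PER_MONTH)
yearly = annualized_savings(MONTHLY_SAVINGS_USD)
print(f"Implied hourly rate: ${rate:.2f}")
print(f"Annualized savings:  ${yearly:,.0f}")
```

Under these assumptions the implied labor rate is about $51.48 per hour. Because the abstract reports "300+" hours, the true rate would be at or below that value.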
Problem

Research questions and friction points this paper is trying to address.

Automating log analysis for massive IT system logs
Processing large log volumes efficiently using LLMs on CPUs
Reducing manual effort in IT issue diagnosis and support

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging LLMs for automated log analysis
Efficiently running LLMs on CPUs
Processing massive log volumes rapidly
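
The points above do not say how logs are fed to the model. One generic pattern for handling large log volumes, shown here only as an illustration and not as the paper's method, is to group log lines into size-bounded batches and summarize each batch independently. In this sketch `batch_log_lines`, `summarize_logs`, and `count_errors` are hypothetical names, and `count_errors` is a stand-in for a real LLM call.

```python
from typing import Callable, Iterable, Iterator, List


def batch_log_lines(lines: Iterable[str], max_chars: int) -> Iterator[List[str]]:
    """Group consecutive log lines into batches of at most max_chars characters.

    Size is the sum of line lengths (separators are not counted).
    A single line longer than max_chars is emitted as its own batch.
    """
    batch: List[str] = []
    size = 0
    for line in lines:
        # Close the current batch if adding this line would exceed the budget.
        if batch and size + len(line) > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += len(line)
    if batch:
        yield batch


def summarize_logs(
    lines: Iterable[str],
    summarize: Callable[[str], str],
    max_chars: int = 2000,
) -> List[str]:
    """Apply a summarizer to each batch and return one summary per batch."""
    return [summarize("\n".join(b)) for b in batch_log_lines(lines, max_chars)]


def count_errors(text: str) -> str:
    """Placeholder for a model call: counts lines containing 'ERROR'."""
    n = sum("ERROR" in line for line in text.splitlines())
    return f"{n} error line(s)"


sample = [
    "2024-03-01 10:00:00 INFO service started",
    "2024-03-01 10:00:05 ERROR connection refused",
    "2024-03-01 10:00:06 INFO retrying",
    "2024-03-01 10:00:09 ERROR connection refused",
]
summaries = summarize_logs(sample, count_errors, max_chars=90)
for s in summaries:
    print(s)
```

Bounding each batch keeps every request within a fixed input size, and independent batches could later be processed in parallel. The character budget here is an arbitrary example value.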

Authors

Pranjal Gupta
IBM Research, India
Karan Bhukar
Amazon, India
Harshit Kumar
Whiterabbit.ai, Inc.
Deep Learning, Security, Hardware Security and Trust
Seema Nagar
IBM Research, India
P. Mohapatra
IBM Research, India
Debanjana Kar
IBM Research, India