Argo: Efficient Importance Labeling for Enterprise Email Systems

📅 2026-05-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses enterprise email importance labeling, where traditional methods generalize poorly and large language models incur prohibitive inference costs. The authors propose Argo, an efficient labeling framework that combines a low-cost proxy labeling strategy, context-aware modeling, and load-aware elastic inference scheduling, reducing computational overhead while preserving labeling quality. On three open-source email datasets, Argo achieves a 148–167× reduction in inference cost and a 20–640,000× reduction in profiling cost with negligible quality degradation, lowering the barrier to enterprise-scale deployment.
📝 Abstract
Email importance labeling has long been a critical yet challenging problem for businesses and individuals. Traditional approaches, such as keyword matching, user-defined rules, and sender-based heuristics, demand extensive manual feature engineering and fail to scale or generalize. Recent advances in large language models (LLMs) make them a natural fit for this task, offering deep contextual understanding and superior labeling quality. However, running LLMs such as GPT-4.1 at enterprise email volumes incurs prohibitive computational costs and hinders real-world deployment. We explore the trade-off space of alternative labeling schemes to GPT-4.1-scale LLMs, with the goal of achieving near-GPT-level labeling quality at significantly lower cost. We develop Argo, an enterprise email labeling framework, in which a profiler efficiently searches the cost-quality trade-off space and identifies cost-efficient labeling alternatives. Additionally, we design an on-demand provisioning scheme that scales Argo with real-time load to minimize cost increases during peak-load inference. Across three open-source email datasets, Argo achieves a 148–167× inference cost reduction with negligible quality degradation and 20–640,000× lower profiling costs, making large-scale, context-aware email labeling practical for enterprises.
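The cost-quality search described in the abstract can be pictured with a minimal sketch. It is not Argo's profiler, whose design the abstract does not specify. It assumes each candidate labeling scheme has already been scored for cost and for agreement with a reference labeler, and it picks the cheapest candidate whose agreement stays within a tolerance. The `Candidate` type, the `cheapest_within_tolerance` function, the candidate names, and all numbers are hypothetical placeholders, not results from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical labeling scheme with illustrative (made-up) scores."""
    name: str
    cost_per_1k_emails: float  # arbitrary cost unit, placeholder value
    agreement: float           # fraction of labels matching the reference labeler


def cheapest_within_tolerance(
    candidates: Sequence[Candidate],
    reference_agreement: float = 1.0,
    max_drop: float = 0.02,
) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose agreement is at most
    `max_drop` below the reference, or None if no candidate qualifies."""
    eligible = [
        c for c in candidates
        if reference_agreement - c.agreement <= max_drop
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.cost_per_1k_emails)


# Placeholder candidates; names and numbers are assumptions for illustration.
candidates = [
    Candidate("reference-llm", cost_per_1k_emails=10.0, agreement=1.00),
    Candidate("small-model-with-context", cost_per_1k_emails=0.07, agreement=0.985),
    Candidate("keyword-rules", cost_per_1k_emails=0.001, agreement=0.71),
]

best = cheapest_within_tolerance(candidates)
```

With these placeholder values the cheapest scheme is rejected because its agreement falls too far below the reference, and the mid-cost candidate is selected. Tightening `max_drop` to zero falls back to the reference labeler itself.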
Problem

Research questions and friction points this paper is trying to address.

email importance labeling
large language models
cost efficiency
enterprise email systems
inference cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

cost-quality trade-off
efficient LLM inference
email importance labeling
on-demand provisioning
profiling