Reliable Inference in Edge-Cloud Model Cascades via Conformal Alignment

📅 2025-10-20

🤖 AI Summary

Edge intelligence trades reliability for low latency, and the tension is sharpest in edge-cloud cascaded inference, where it is hard to ensure conditional coverage: that a prediction set returned by the edge contains the true label with a user-specified probability, measured with respect to the cloud model's predictive distribution. This work formalizes the conditional coverage problem for cascaded inference and proposes a conformal alignment-based (CAb) cascading mechanism that casts the decision to escalate from edge to cloud as a multiple-hypothesis testing (MHT) problem, selecting the inputs that can be safely handled at the edge. The method provides a statistical guarantee on the average fraction of edge decisions that satisfy cloud-level conditional coverage, with a user-controlled risk level and a tunable trade-off among coverage, deferral rate, and set size. Experiments on CIFAR-100 and TeleQnA show that edge predictions maintain the target conditional coverage while substantially reducing offloading to the cloud, at the cost of modest increases in prediction-set size.

📝 Abstract

Edge intelligence enables low-latency inference via compact on-device models, but assuring reliability remains challenging. We study edge-cloud cascades that must preserve conditional coverage: whenever the edge returns a prediction set, it should contain the true label with a user-specified probability, as if produced by the cloud model. We formalize conditional coverage with respect to the cloud predictive distribution, and introduce a conformal alignment-based (CAb) cascading mechanism that certifies this property with user control over the risk level. Our method casts escalation from edge to cloud models as a multiple-hypothesis testing (MHT) problem, tailoring conformal alignment (CA) to select which inputs can be safely handled at the edge. The proposed CAb model cascading method yields statistical guarantees on the average fraction of edge decisions that satisfy cloud-level conditional coverage. The procedure applies to arbitrary edge prediction sets, including variants of conformal prediction (CP), and exposes a tunable trade-off among coverage, deferral rate, and set size. Experiments on CIFAR-100 image classification and the TeleQnA question-answering (QA) benchmark show that the proposed CAb cascade maintains the target conditional coverage for edge predictions while substantially reducing offloading to the cloud and incurring modest increases in prediction-set size.
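The abstract describes escalation as a multiple-hypothesis test but does not spell out the test statistic or the selection rule. As a rough illustration only, the sketch below uses the standard conformal-alignment recipe: conformal p-values computed from a labeled calibration set, followed by the Benjamini-Hochberg procedure. Everything named here is an assumption made for the example, not the paper's implementation: the function names, the `cal_aligned` flags (whether a calibration input's edge prediction set was judged to meet the cloud-level criterion), and the alignment scores (any predictor's estimate of how likely an input is to be aligned, higher meaning more likely).

```python
import numpy as np


def conformal_pvalues(cal_scores, cal_aligned, test_scores):
    """Conformal p-values for the null 'this test input is NOT aligned'.

    cal_scores  : predicted alignment scores on calibration inputs (hypothetical)
    cal_aligned : True where the calibration input was actually aligned (hypothetical)
    test_scores : predicted alignment scores on new inputs
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_aligned = np.asarray(cal_aligned, dtype=bool)
    test_scores = np.asarray(test_scores, dtype=float)

    # Only misaligned calibration points can contradict a high test score.
    null_scores = cal_scores[~cal_aligned]
    counts = (null_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1.0 + counts) / (len(cal_scores) + 1.0)


def benjamini_hochberg(pvals, alpha):
    """Boolean mask of hypotheses rejected by Benjamini-Hochberg at level alpha."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m

    keep = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0]) + 1  # largest rank meeting its threshold
        keep[order[:k]] = True
    return keep


def keep_on_edge(cal_scores, cal_aligned, test_scores, alpha=0.1):
    """True where the edge answer is kept; False where the input is escalated."""
    pvals = conformal_pvalues(cal_scores, cal_aligned, test_scores)
    return benjamini_hochberg(pvals, alpha)


if __name__ == "__main__":
    cal_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.85, 0.95, 0.6]
    cal_aligned = [True, True, True, False, False, False, True, True, True]
    test_scores = [0.99, 0.5, 0.05]
    print(keep_on_edge(cal_scores, cal_aligned, test_scores, alpha=0.2))
```

In this sketch `alpha` plays the role of the user-specified risk level: raising it keeps more inputs on the edge at the price of a weaker guarantee, while lowering it defers more inputs to the cloud.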

Problem

Research questions and friction points this paper is trying to address.

Ensuring conditional coverage in edge-cloud cascades

Certifying statistical guarantees via conformal alignment

Optimizing trade-off between coverage and cloud offloading

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal alignment certifies cloud-level conditional coverage for edge decisions

Multiple hypothesis testing controls edge-to-cloud escalation

Tunable trade-off among coverage, deferral rate, and set size

🔎 Similar Papers

Agreement-Based Cascading for Efficient Inference