Trust-Aware Routing for Distributed Generative AI Inference at the Edge

📅 2026-03-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the vulnerability of generative AI inference in distributed edge environments to device failures and untrustworthy nodes, which conventional best-effort routing does not adequately mitigate. To enable reliable collaborative inference, the paper introduces G-TRAC, a framework that integrates trust-aware routing into distributed generative AI inference. G-TRAC combines a polynomial-time, risk-bounded shortest path algorithm with a lightweight hybrid trust architecture and a background synchronization mechanism, so that routing decisions account for trustworthiness, performance, and reliability together. Experimental evaluation on a heterogeneous edge testbed shows that the approach significantly improves inference completion rates, effectively isolates unreliable nodes, and maintains robust execution even under node failures and network partitions.

📝 Abstract

Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous edge devices rather than on a single trusted server. In such environments, a single device failure or misbehavior can disrupt the entire inference process, making traditional best-effort peer-to-peer routing insufficient. Coordinating distributed generative inference therefore requires mechanisms that explicitly account for reliability, performance variability, and trust among participating peers. In this paper, we present G-TRAC, a trust-aware coordination framework that integrates algorithmic path selection with system-level protocol design to ensure robust distributed inference. First, we formulate the routing problem as a Risk-Bounded Shortest Path computation and introduce a polynomial-time solution that combines trust-floor pruning with Dijkstra's search, achieving sub-millisecond median routing latency at practical edge scales, and remaining below 10 ms at larger scales. Second, to operationally support the routing logic in dynamic environments, the framework employs a Hybrid Trust Architecture that maintains global reputation state at stable anchors while disseminating lightweight updates to edge peers via background synchronization. Experimental evaluation on a heterogeneous testbed of commodity devices demonstrates that G-TRAC significantly improves inference completion rates, effectively isolates unreliable peers, and sustains robust execution even under node failures and network partitions.
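
The abstract names the routing technique but does not spell out its details. As a rough illustration only, the Python sketch below shows one way trust-floor pruning could be combined with Dijkstra's search. It assumes a per-peer trust score in [0, 1], edge weights that stand for link latency, and a single `trust_floor` threshold. The function name, the data layout, and the example values are hypothetical and are not taken from the paper, whose actual risk bound may be defined differently.

```python
import heapq


def trust_pruned_shortest_path(graph, trust, source, target, trust_floor):
    """Illustrative sketch, not the paper's algorithm.

    graph: {node: {neighbor: latency}} (assumed layout)
    trust: {node: score in [0, 1]}     (assumed layout)
    Returns (total_latency, path) or None if no sufficiently trusted path exists.
    """

    def allowed(node):
        # Trust-floor pruning: peers below the floor are never visited.
        return trust.get(node, 0.0) >= trust_floor

    if not (allowed(source) and allowed(target)):
        return None

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()

    # Standard Dijkstra search over the pruned graph.
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, latency in graph.get(node, {}).items():
            if nbr in visited or not allowed(nbr):
                continue
            nd = d + latency
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))

    if target not in dist:
        return None

    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical four-peer topology: "b" is fast but poorly trusted.
example_graph = {"a": {"b": 1.0, "c": 5.0}, "b": {"d": 1.0}, "c": {"d": 2.0}, "d": {}}
example_trust = {"a": 0.9, "b": 0.3, "c": 0.8, "d": 0.9}

print(trust_pruned_shortest_path(example_graph, example_trust, "a", "d", 0.5))
```

With a floor of 0.5 the low-trust peer `b` is excluded, so the search returns the slower route through `c`; lowering the floor to 0.0 restores the faster route through `b`. Because pruning is a single pass over the peers and Dijkstra's search is polynomial, the combination stays polynomial-time, in line with the claim in the abstract.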

Problem

Research questions and friction points this paper is trying to address.

Trust-Aware Routing

Distributed Generative AI

Edge Inference

Reliability

Peer-to-Peer Coordination

Innovation

Methods, ideas, or system contributions that make the work stand out.

Trust-aware routing

Risk-Bounded Shortest Path

Hybrid Trust Architecture