Libra: Accelerating Socket I/O via Programmable Selective Data Copying

📅 2026-04-30

🤖 AI Summary
This work addresses the performance bottleneck that Layer 7 (L7) proxies incur by copying full payloads between kernel and user space. Existing approaches often compromise POSIX compatibility or require application modifications. The authors instead propose an OS-level selective-copy framework that preserves the standard socket interface: only protocol metadata, such as HTTP headers, is copied to user space for routing decisions, while the bulk payload stays in the kernel for direct forwarding. Implemented in Linux, the framework uses eBPF to identify metadata boundaries dynamically and coordinates the receive and transmit paths so that the retained payload can be reused. Evaluated on unmodified Nginx and HAProxy, the approach achieves up to 4.2× higher plaintext throughput and reduces P99 tail latency by over 90%; combined with hardware-offloaded kTLS, it yields a 2.0× improvement in encrypted throughput and 65% lower tail latency.
📝 Abstract
Layer-7 (L7) proxies are critical to modern cloud-native systems, yet their performance is increasingly bottlenecked by copying entire payloads across the kernel-user boundary. Existing approaches reduce this overhead but typically sacrifice compatibility with unmodified POSIX applications, introduce new APIs, or require specialized environments. We show that, under conventional OS abstractions, fully eliminating kernel-user copies while preserving standard socket semantics for unmodified proxies is fundamentally impossible. This leads to a practical insight: in common L7 workloads, proxies inspect only small metadata (e.g., HTTP headers) for routing, while forwarding the bulk payload unchanged. Based on this insight, we present Libra, an OS-level selective-copy framework that copies only metadata to user space and retains the bulk payload in the kernel for forwarding, reducing data movement without breaking compatibility. Libra uses eBPF to identify protocol-specific metadata boundaries and to coordinate selective copy and payload reuse across receive and transmit paths, all without modifying the socket API. Implemented in Linux and evaluated with unmodified Nginx and HAProxy, Libra improves plaintext throughput by up to 4.2× and reduces P99 tail latency by over 90%. With hardware-offloaded kTLS, it boosts encrypted throughput by 2.0× and cuts tail latency by 65%.
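The selective-copy idea can be illustrated with a minimal user-space sketch. This is not Libra's kernel or eBPF code, which the abstract does not show; the names `split_metadata` and `copy_fraction` are hypothetical, and the sketch assumes an HTTP/1.1 message whose header block ends at the first blank line (`\r\n\r\n`). It only shows where the metadata boundary falls and how small the copied portion can be relative to the whole message.

```python
from typing import Tuple

# Assumption: HTTP/1.1 framing, where a blank line terminates the headers.
HEADER_END = b"\r\n\r\n"


def split_metadata(buf: bytes) -> Tuple[bytes, int]:
    """Return (metadata, payload_len) for one buffered message.

    metadata    -- the bytes a proxy would need to inspect for routing
    payload_len -- the number of bytes that would not need to be copied
    """
    idx = buf.find(HEADER_END)
    if idx == -1:
        # Boundary not seen yet: conservatively treat everything as metadata.
        return buf, 0
    boundary = idx + len(HEADER_END)
    return buf[:boundary], len(buf) - boundary


def copy_fraction(buf: bytes) -> float:
    """Fraction of the message that falls on the metadata side of the boundary."""
    if not buf:
        return 0.0
    metadata, _ = split_metadata(buf)
    return len(metadata) / len(buf)


if __name__ == "__main__":
    # Hypothetical request: a short header block followed by a 64 KiB body.
    request = (
        b"POST /upload HTTP/1.1\r\n"
        b"Host: example.com\r\n"
        b"Content-Length: 65536\r\n"
        b"\r\n"
    ) + b"x" * 65536
    metadata, payload_len = split_metadata(request)
    print(len(metadata), payload_len, copy_fraction(request))
```

In this toy case the header block is a tiny share of the message, which is the intuition behind copying only metadata; the throughput and latency figures reported above come from the authors' evaluation, not from this sketch.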
Problem

Research questions and friction points this paper is trying to address.

Layer-7 proxy
kernel-user data copying
socket I/O
compatibility
performance bottleneck
Innovation

Methods, ideas, or system contributions that make the work stand out.

selective copying
eBPF
kernel-user boundary
L7 proxy
socket I/O acceleration