AI Summary
To address the challenge of simultaneously achieving high throughput, low latency, and optimal rate-distortion (RD) performance in real-time ultra-high-definition (UHD) video transcoding, this paper presents a systematic evaluation of NVIDIA's Split-Frame Encoding (SFE) technology for 4K/8K hardware-accelerated encoding. We examine SFE, which encodes a single frame in parallel across multiple NVENC cores, and conduct comprehensive experiments using standardized test sequences to jointly assess RD performance, encoding throughput, power consumption, and end-to-end latency. Results show that SFE nearly doubles encoding throughput, incurs no additional latency for 4K, and can reduce end-to-end latency for 8K, while maintaining negligible RD degradation (average BD-rate increase <0.5%). Crucially, it enables real-time 8K encoding under high-quality presets. This work provides the first quantitative validation of SFE's synergistic gains in energy efficiency and real-time capability, establishing a practical hardware-acceleration paradigm for UHD cloud transcoding and edge video processing.
Abstract
NVIDIA Encoder (NVENC) features in modern NVIDIA GPUs offer significant advantages over software encoders by providing comparable Rate-Distortion (RD) performance while consuming considerably less power. The increasing capability of consumer devices to capture footage in Ultra High-Definition (UHD) at 4K and 8K resolutions necessitates high-performance video transcoders for internet-based delivery. To address this demand, NVIDIA introduced Split-Frame Encoding (SFE), a technique that leverages the multiple on-die NVENC encoders available in high-end GPUs. SFE splits a single UHD frame for parallel encoding across these physical encoders and subsequently stitches the results, which significantly improves encoding throughput. However, this approach is known to incur an RD performance penalty. The widespread adoption of NVIDIA GPUs in data centers, driven by the rise of Generative AI, means NVENC is poised to play a critical role in transcoding UHD video. To better understand the performance-efficiency tradeoff of SFE, this paper evaluates SFE's impact on RD performance, encoding throughput, power consumption, and end-to-end latency using standardized test sequences. The results show that for real-time applications, SFE nearly doubles encoding throughput with a negligible RD performance penalty. The added throughput enables the use of higher-quality presets for 4K and makes real-time 8K encoding feasible, effectively offsetting the minor RD penalty. Moreover, SFE adds no latency at 4K and can reduce it at 8K, positioning it as a key enabler for high-throughput, real-time UHD transcoding.