🤖 AI Summary
Existing inverse rendering methods rely on predefined, non-learnable importance samplers that cannot adapt to spatially and directionally varying integrands, which leads to high variance and suboptimal reconstructions. To address this, we propose the first spatially and directionally aware learnable importance sampler. Our method employs a tensorial neural representation to jointly model scene geometry and reflectance properties, and uses normalizing flows to sample incident light directions and evaluate their probability density function (PDF), conditioned on both position and reflected direction. This is the first work to combine tensorial representations with normalizing flows so that the sampler is trained end to end to match the integrand of the rendering equation, in contrast to conventional fixed-sampling paradigms. Evaluated on both synthetic and real-world datasets, our approach significantly reduces rendering variance and improves the accuracy of geometry, material, and lighting reconstruction, consistently outperforming state-of-the-art methods.
📝 Abstract
Inverse rendering aims to recover scene geometry, material properties, and lighting from multi-view images. Given the complexity of light-surface interactions, importance sampling is essential for evaluating the rendering equation, as it reduces variance and improves the efficiency of Monte Carlo sampling. Existing inverse rendering methods typically use manually pre-defined, non-learnable importance samplers, which struggle to match the spatially and directionally varying integrand and therefore yield high variance and suboptimal performance. To address this limitation, we propose learning a spatially and directionally aware importance sampler for the rendering equation, so that it can accurately and flexibly capture the unconstrained complexity of a typical scene. We further formulate TensoFlow, a generic approach to sampler learning in inverse rendering that enables the sampler to closely match the integrand of the rendering equation both spatially and directionally. Concretely, our sampler is parameterized by normalizing flows, which support both directional sampling of incident light and probability density function (PDF) inference. To capture the spatial characteristics of the sampler, we learn a tensorial representation of the scene space that provides spatial conditions; together with the reflected direction, these conditions yield spatially and directionally aware sampling distributions. Our model is optimized by minimizing the difference between the integrand and the density modeled by our normalizing flow. Extensive experiments on both synthetic and real-world benchmarks validate the superiority of TensoFlow over prior alternatives.
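To illustrate why a sampler that matches the integrand reduces variance, the following minimal sketch estimates a hemispherical integral of a glossy lobe, cos^K(θ), with three samplers: uniform, cosine-weighted, and one whose PDF is proportional to the integrand. It is not the TensoFlow implementation. The lobe sharpness `K`, the function names, and the analytic "matched" sampler are assumptions introduced here; the matched sampler merely stands in for what a learned, flow-based sampler would aim to approximate at a given position and reflected direction.

```python
import math
import random
import statistics

K = 32  # hypothetical lobe sharpness, chosen only for illustration


def integrand(cos_theta):
    """Axially symmetric toy integrand over the hemisphere: cos^K(theta)."""
    return cos_theta ** K


# Each sampler returns (cos_theta, pdf), where pdf is a density over solid angle.
# The azimuth is omitted because the integrand does not depend on it.

def sample_uniform(rng):
    # Uniform hemisphere sampling: cos(theta) is uniform in [0, 1).
    return rng.random(), 1.0 / (2.0 * math.pi)


def sample_cosine(rng):
    # Cosine-weighted sampling: pdf = cos(theta) / pi.
    cos_theta = math.sqrt(1.0 - rng.random())  # in (0, 1], so pdf > 0
    return cos_theta, cos_theta / math.pi


def sample_matched(rng):
    # PDF proportional to the integrand: pdf = (K + 1) cos^K(theta) / (2 pi).
    cos_theta = (1.0 - rng.random()) ** (1.0 / (K + 1))
    return cos_theta, (K + 1) * cos_theta ** K / (2.0 * math.pi)


def estimate(sampler, n, seed=0):
    """Monte Carlo estimate: mean and variance of integrand / pdf."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n):
        cos_theta, pdf = sampler(rng)
        weights.append(integrand(cos_theta) / pdf)
    return statistics.fmean(weights), statistics.pvariance(weights)


if __name__ == "__main__":
    exact = 2.0 * math.pi / (K + 1)  # closed-form value of the integral
    print(f"exact integral: {exact:.6f}")
    for name, sampler in [("uniform", sample_uniform),
                          ("cosine", sample_cosine),
                          ("matched", sample_matched)]:
        mean, var = estimate(sampler, 20000)
        print(f"{name:8s} mean={mean:.6f} per-sample variance={var:.3e}")
```

All three estimators are unbiased, but the per-sample variance shrinks as the sampling PDF approaches the shape of the integrand, and it vanishes (up to floating-point rounding) when the two are proportional. In a real scene the integrand changes with position and direction, so no single fixed PDF can be matched everywhere, which is the motivation for conditioning a learned sampler on spatial features and the reflected direction.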