Learning to Navigate Under Imperfect Perception: Conformalised Segmentation for Safe Reinforcement Learning

📅 2025-10-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
In safety-critical navigation, imperfect perception introduces significant risk in hazard avoidance—existing approaches either assume perfect hazard detection or lack rigorous, finite-sample safety guarantees. Method: We propose COPPOL, the first framework to integrate distribution-free, finite-sample safety guarantees directly into the perception module. It unifies conformal prediction with semantic segmentation to produce a calibrated hazard map and a risk-aware cost field with provable upper bounds on missed detections; this field then guides uncertainty-aware reinforcement learning for planning. Results: Evaluated on two satellite remote sensing benchmarks, COPPOL achieves up to 6× improvement in hazardous region coverage and reduces hazard violation rates by ~50%, while maintaining robustness under distributional shift. Its core contribution is a verifiable, end-to-end propagation of perceptual uncertainty into decision-time safety boundaries.
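The summary's central mechanism, calibrating a segmentation model with conformal prediction so that missed hazard detections are provably bounded, can be sketched with split conformal prediction. The sketch below is illustrative, not COPPOL's exact procedure: function names and the score choice (one minus the predicted hazard probability) are assumptions; the finite-sample guarantee is the standard marginal one for exchangeable calibration data.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold for pixel-wise hazard detection.

    cal_probs:  predicted hazard probabilities for calibration pixels, shape (n,)
    cal_labels: 1 where a pixel is truly hazardous, 0 otherwise
    alpha:      target bound on the fraction of hazardous pixels missed
    """
    # Nonconformity score for each truly hazardous calibration pixel:
    # how much probability mass the model failed to put on "hazard".
    scores = 1.0 - cal_probs[cal_labels == 1]
    n = scores.size
    # Finite-sample-corrected quantile level: ceil((n+1)(1-alpha))/n.
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, q_level, method="higher")

def conformal_hazard_map(probs, q_hat):
    """Flag every pixel whose score does not exceed the calibrated threshold.

    Marginally over exchangeable data, at most an alpha fraction of
    truly hazardous pixels are left unflagged.
    """
    return (1.0 - probs) <= q_hat  # equivalently: probs >= 1 - q_hat
```

At test time, the calibrated map is deliberately conservative: lowering `alpha` widens the flagged region, trading navigation efficiency for a tighter bound on missed detections.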

📝 Abstract
Reliable navigation in safety-critical environments requires both accurate hazard perception and principled uncertainty handling to strengthen downstream safety. Existing approaches, despite their effectiveness, assume perfect hazard detection, while uncertainty-aware perception methods lack finite-sample guarantees. We present COPPOL, a conformal-driven perception-to-policy learning approach that integrates distribution-free, finite-sample safety guarantees into semantic segmentation, yielding calibrated hazard maps with rigorous bounds on missed detections. These maps induce risk-aware cost fields for downstream RL planning. Across two satellite-derived benchmarks, COPPOL increases hazard coverage (up to 6×) over baselines, achieving near-complete detection of unsafe regions while reducing hazard violations during navigation (by up to approximately 50%). Moreover, our approach remains robust under distributional shift, preserving both safety and efficiency.
Problem

Research questions and friction points this paper is trying to address.

Addresses unreliable hazard perception in safety-critical navigation tasks
Provides finite-sample safety guarantees for semantic segmentation uncertainty
Enables risk-aware reinforcement learning with calibrated hazard detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal-driven perception-to-policy learning approach
Integrates safety guarantees into semantic segmentation
Generates risk-aware cost fields for RL planning
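The last innovation bullet, turning calibrated hazard maps into risk-aware cost fields for RL planning, can be illustrated with a simple distance-decayed shaping cost. This is a minimal sketch under assumed design choices (exponential decay, brute-force distance computation); the paper's actual cost construction may differ.

```python
import numpy as np

def risk_cost_field(hazard_mask, hazard_cost=10.0, decay=2.0):
    """Turn a calibrated binary hazard map into a dense cost field.

    Cost is maximal inside flagged cells and decays exponentially with
    Euclidean distance to the nearest flagged cell, so a planner is
    penalised for merely approaching uncertain hazard regions.
    """
    h, w = hazard_mask.shape
    ys, xs = np.nonzero(hazard_mask)
    yy, xx = np.mgrid[0:h, 0:w]
    # Squared distance from every cell to every flagged cell (brute force,
    # fine for small grids; a distance transform would scale better).
    d2 = (yy[..., None] - ys) ** 2 + (xx[..., None] - xs) ** 2
    dist = np.sqrt(d2.min(axis=-1))
    return hazard_cost * np.exp(-dist / decay)
```

In an RL loop, this field would typically enter the objective as a per-step penalty, e.g. `reward = task_reward - cost_field[agent_row, agent_col]`, so that conservatively flagged cells from the conformal map directly shape the learned policy away from them.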