🤖 AI Summary
In medical image segmentation, conventional active learning (AL) relies on expert annotation of ambiguous regions and remains hampered by high annotation costs, ill-defined boundaries, and substantial cognitive load. To address these challenges, we propose LINGUAL, a language-guided active learning framework in which natural language instructions replace manual region annotations. Using in-context learning, the method translates these instructions into executable programs and then carries out the corresponding sequence of sub-tasks automatically. The approach combines natural language processing, program synthesis, and active domain adaptation (ADA), minimizing human intervention without compromising performance. In ADA experiments, the framework achieves performance comparable or superior to AL baselines while reducing estimated annotation time by approximately 80%. This lowers the burden on experts and sidesteps precise boundary delineation, two key bottlenecks in clinical annotation workflows.
📝 Abstract
Although active learning (AL) in segmentation tasks enables experts to annotate selected regions of interest (ROIs) instead of entire images, it remains challenging, labor-intensive, and cognitively demanding because of the blurry and ambiguous boundaries commonly observed in medical images. Moreover, in conventional AL, annotation effort depends on ROI size: larger regions make the task cognitively easier but incur higher annotation costs, whereas smaller regions demand finer precision and more attention from the expert. In this context, language guidance provides an effective alternative, requiring minimal expert effort while bypassing the cognitively demanding task of precise boundary delineation. Toward this goal, we introduce LINGUAL, a framework that receives natural language instructions from an expert, translates them into executable programs through in-context learning, and automatically performs the corresponding sequence of sub-tasks without any human intervention. We demonstrate the effectiveness of LINGUAL in active domain adaptation (ADA), where it achieves performance comparable or superior to AL baselines while reducing estimated annotation time by approximately 80%.