Common Inpainted Objects In-N-Out of Context

📅 2025-05-31

🤖 AI Summary

Problem: Existing vision datasets lack out-of-context samples, which hinders research in context-aware visual understanding and image forensics. Method: We introduce COinCO, a large-scale dataset of 97,722 images containing both in-context and out-of-context objects. The images are generated by replacing objects in COCO images with diffusion-based inpainting, and each inpainted object is labeled as in- or out-of-context by a multimodal large language model. We also propose the Objects-from-Context prediction task, which asks which new objects naturally belong in a given scene, and a way to add context information to existing fake detectors without fine-tuning them. Results: COinCO supports training context classifiers that decide whether an object belongs in its scene, enables Objects-from-Context prediction at both the instance and clique levels, and provides context-enhanced fake detection on top of state-of-the-art methods.

📝 Abstract

We present Common Inpainted Objects In-N-Out of Context (COinCO), a novel dataset addressing the scarcity of out-of-context examples in existing vision datasets. By systematically replacing objects in COCO images through diffusion-based inpainting, we create 97,722 unique images featuring both contextually coherent and inconsistent scenes, enabling effective context learning. Each inpainted object is meticulously verified and categorized as in- or out-of-context through a multimodal large language model assessment. Our analysis reveals significant patterns in semantic priors that influence inpainting success across object categories. We demonstrate three key tasks enabled by COinCO: (1) training context classifiers that effectively determine whether existing objects belong in their context; (2) a novel Objects-from-Context prediction task that determines which new objects naturally belong in given scenes at both instance and clique levels; and (3) context-enhanced fake detection on state-of-the-art methods without fine-tuning. COinCO provides a controlled testbed with contextual variations, establishing a foundation for advancing context-aware visual understanding in computer vision and image forensics. Our code and data are at: https://github.com/YangTianze009/COinCO.
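The object-replacement step described in the abstract can be pictured with a small sketch. This is not the authors' code: the category list, the simplified annotation dictionary, the `ReplacementPlan` record, and the choice to build the mask from a bounding box (rather than a segmentation) are all assumptions made for illustration. The sketch only prepares the inputs an inpainting model would need, namely a binary mask and a target category.

```python
import random
from dataclasses import dataclass

# Hypothetical subset of COCO category names, for illustration only.
CATEGORIES = ["person", "bicycle", "car", "dog", "couch", "toaster"]


@dataclass
class ReplacementPlan:
    """Hypothetical record describing one planned object replacement."""
    image_id: int
    original_category: str
    new_category: str
    mask: list  # rows of 0/1 values; 1 marks pixels to inpaint


def bbox_to_mask(bbox, width, height):
    """Rasterize a COCO-style [x, y, w, h] box into a binary mask."""
    x, y, w, h = (int(v) for v in bbox)
    # Clip the box to the image bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)
    return [
        [1 if (x0 <= col < x1 and y0 <= row < y1) else 0 for col in range(width)]
        for row in range(height)
    ]


def plan_replacement(image_id, annotation, width, height, rng):
    """Pick a different category at random and build the inpainting mask.

    `annotation` is a simplified dict with "category" and "bbox" keys
    (an assumption; real COCO annotations use numeric category ids).
    """
    candidates = [c for c in CATEGORIES if c != annotation["category"]]
    return ReplacementPlan(
        image_id=image_id,
        original_category=annotation["category"],
        new_category=rng.choice(candidates),
        mask=bbox_to_mask(annotation["bbox"], width, height),
    )


rng = random.Random(0)
plan = plan_replacement(1, {"category": "dog", "bbox": [1, 1, 2, 2]}, 4, 4, rng)
```

In a full pipeline, the mask and a prompt naming `new_category` would be passed to a diffusion inpainting model, and the result would then be judged in- or out-of-context by a multimodal large language model, as the abstract describes.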

Problem

Research questions and friction points this paper is trying to address.

Addressing the scarcity of out-of-context examples in existing vision datasets

Creating a dataset with both contextually coherent and inconsistent scenes for context learning

Enabling context-aware tasks like classification and fake detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based inpainting for object replacement

Multimodal LLM for context verification

Context-enhanced fake detection on top of existing detectors, without fine-tuning them
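The source does not say how the context signal is combined with a fake detector's output. One simple possibility, shown here purely as an assumption, is to blend the two scores with a fixed weight so that neither model needs retraining. The function name, the weighted-average rule, and the default weight are all hypothetical.

```python
def context_enhanced_score(detector_prob, out_of_context_prob, weight=0.5):
    """Blend a detector's fake probability with an out-of-context probability.

    Hypothetical fusion rule (weighted average); not taken from the paper.
    Both inputs are assumed to be probabilities in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 - weight) * detector_prob + weight * out_of_context_prob


# An image the detector is unsure about, but whose object looks out of place.
score = context_enhanced_score(0.4, 0.8, weight=0.5)
```

With `weight=0` the detector's score is returned unchanged, so the context signal is strictly an add-on in this sketch.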

🔎 Similar Papers

Improving the Robustness of Object Detection and Classification AI models against Adversarial Patch Attacks