Common Inpainted Objects In-N-Out of Context

📅 2025-05-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing vision datasets lack out-of-context samples, hindering research in context-aware visual understanding and image forensics. Method: We introduce COinCO, a large-scale dataset of 97,722 images containing both in-context and out-of-context objects, generated by replacing objects in COCO images through diffusion-based inpainting and labeling each inpainted object as in- or out-of-context with a multimodal large language model. We propose the novel “Objects-from-Context” prediction task and show context-enhanced fake detection that requires no fine-tuning of the underlying detectors. Results: COinCO supports training context classifiers that decide whether an object belongs in its scene, enables Objects-from-Context prediction at both instance and clique levels, and adds context information to state-of-the-art fake detection methods without fine-tuning.

📝 Abstract
We present Common Inpainted Objects In-N-Out of Context (COinCO), a novel dataset addressing the scarcity of out-of-context examples in existing vision datasets. By systematically replacing objects in COCO images through diffusion-based inpainting, we create 97,722 unique images featuring both contextually coherent and inconsistent scenes, enabling effective context learning. Each inpainted object is meticulously verified and categorized as in- or out-of-context through a multimodal large language model assessment. Our analysis reveals significant patterns in semantic priors that influence inpainting success across object categories. We demonstrate three key tasks enabled by COinCO: (1) training context classifiers that effectively determine whether existing objects belong in their context; (2) a novel Objects-from-Context prediction task that determines which new objects naturally belong in given scenes at both instance and clique levels; and (3) context-enhanced fake detection on state-of-the-art methods without fine-tuning. COinCO provides a controlled testbed with contextual variations, establishing a foundation for advancing context-aware visual understanding in computer vision and image forensics. Our code and data are at: https://github.com/YangTianze009/COinCO.
Problem

Research questions and friction points this paper is trying to address.

Addressing the scarcity of out-of-context examples in vision datasets
Creating a dataset with both coherent and inconsistent scenes for context learning
Enabling context-aware tasks like classification and fake detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based inpainting for object replacement
Multimodal LLM for context verification
Context-enhanced fake detection on state-of-the-art methods without fine-tuning
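
The first two items above (replace an object by inpainting, then have a multimodal LLM label the result as in- or out-of-context) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Sample` fields, `build_dataset`, `stub_inpaint`, `stub_judge`, and the `PLAUSIBLE` lookup table are all hypothetical stand-ins, and the real pipeline would plug a diffusion inpainting model and an MLLM query into the two callables.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Sample:
    """One replacement request (hypothetical schema, not the COinCO format)."""
    image_id: int
    scene: str                 # coarse scene description, e.g. "kitchen"
    original_category: str     # object category being replaced
    replacement_category: str  # object category inpainted in its place


@dataclass(frozen=True)
class LabeledSample:
    sample: Sample
    edited_image: str          # handle to the inpainted image (a string here)
    in_context: bool           # verdict from the judge


def build_dataset(
    samples: Iterable[Sample],
    inpaint: Callable[[Sample], str],
    judge: Callable[[Sample, str], bool],
) -> List[LabeledSample]:
    """Inpaint each sample, then ask the judge whether the new object fits."""
    labeled = []
    for sample in samples:
        edited = inpaint(sample)
        labeled.append(LabeledSample(sample, edited, judge(sample, edited)))
    return labeled


# Stand-ins so the sketch runs without any model (assumptions, not real APIs).
PLAUSIBLE = {("kitchen", "bowl"), ("street", "bicycle")}


def stub_inpaint(sample: Sample) -> str:
    # A real pipeline would call a diffusion inpainting model here.
    return f"{sample.image_id}_{sample.replacement_category}.png"


def stub_judge(sample: Sample, edited_image: str) -> bool:
    # A real pipeline would query a multimodal LLM with the edited image.
    return (sample.scene, sample.replacement_category) in PLAUSIBLE


if __name__ == "__main__":
    requests = [
        Sample(1, "kitchen", "cup", "bowl"),
        Sample(2, "kitchen", "cup", "giraffe"),
    ]
    for row in build_dataset(requests, stub_inpaint, stub_judge):
        print(row.edited_image, "in-context" if row.in_context else "out-of-context")
```

Passing the inpainting and judging steps as callables keeps the sketch independent of any particular model; either step can be swapped without touching the loop.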