Controllable Segmentation-Based Text-Guided Style Editing

📅 2025-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses text-guided region-controllable style editing—specifically, applying a target style (e.g., cyberpunk) to a designated object (e.g., a building) while preserving other regions (e.g., people, trees). To this end, we propose a novel method integrating semantic segmentation with state-space modeling. We introduce fine-grained segmentation guidance into text-driven editing for the first time, designing region-conditioned text embeddings and a region-specific directional loss to jointly ensure semantic boundary consistency and precise alignment with user intent. Our framework builds upon Mask2Former for segmentation and StyleMamba—a state-space model—for stylization. Evaluated on real-world complex scenes, our approach significantly improves edit controllability and visual fidelity: PSNR increases by 2.1 dB over global style transfer, and user satisfaction rises by 37%.

📝 Abstract
We present a novel approach for controllable, region-specific style editing driven by textual prompts. Building upon the state-space style alignment framework introduced by StyleMamba, our method integrates a semantic segmentation model into the style transfer pipeline. This allows users to selectively apply text-driven style changes to specific segments (e.g., "turn the building into a cyberpunk tower") while leaving other regions (e.g., "people" or "trees") unchanged. By incorporating region-wise condition vectors and a region-specific directional loss, our method achieves high-fidelity transformations that respect both semantic boundaries and user-driven style descriptions. Extensive experiments demonstrate that our approach can flexibly handle complex scene stylizations in real-world scenarios, improving control and quality over purely global style transfer methods.
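The paper does not include code, but the region-specific directional loss mentioned in the abstract can be sketched as a CLIP-style directional objective restricted to the segmented region: the edit direction in image-embedding space (before vs. after editing the masked region) should align with the style direction in text-embedding space (source prompt vs. target prompt). This is a minimal sketch under that assumption; the embeddings are taken as given, and `directional_loss` is a hypothetical name, not the authors' implementation.

```python
import numpy as np

def directional_loss(e_img_src, e_img_edit, e_txt_src, e_txt_tgt):
    """One minus cosine similarity between the image-space edit direction
    and the text-space style direction (CLIP-style directional loss).

    e_img_src / e_img_edit: embeddings of the masked region before/after editing
    e_txt_src / e_txt_tgt:  embeddings of the source and target text prompts
                            (e.g. "a building" -> "a cyberpunk tower")
    """
    d_img = e_img_edit - e_img_src
    d_txt = e_txt_tgt - e_txt_src
    cos = np.dot(d_img, d_txt) / (
        np.linalg.norm(d_img) * np.linalg.norm(d_txt) + 1e-8
    )
    return 1.0 - cos  # 0 when perfectly aligned, 2 when opposed
```

Minimizing this value pushes the masked region's appearance change to follow the direction described by the text edit, while pixels outside the mask contribute nothing to the loss.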
Problem

Research questions and friction points this paper is trying to address.

How to apply a text-specified style to a designated region while leaving other regions untouched.
How to integrate semantic segmentation guidance into text-driven style transfer.
How to maintain fidelity and boundary consistency when stylizing complex scenes.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates semantic segmentation for style editing
Uses region-wise condition vectors for transformations
Applies region-specific directional loss for fidelity
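The region-selective application described above reduces, at inference time, to compositing a stylized image with the original through the segmentation mask. The sketch below illustrates that step only; the `stylize` callable stands in for the StyleMamba text-driven stylizer and the mask for a Mask2Former output, both of which are assumptions rather than the paper's released code.

```python
import numpy as np

def region_style_edit(image, mask, stylize):
    """Apply a stylization function only inside the segmented region.

    image:   float array of shape (H, W, 3)
    mask:    float array of shape (H, W), 1.0 inside the target segment
    stylize: callable mapping the full image to its stylized version
             (placeholder for the text-driven StyleMamba stylizer)
    """
    styled = stylize(image)
    m = mask[..., None]                    # broadcast mask over channels
    return m * styled + (1.0 - m) * image  # pixels outside the region are kept
```

In the full method the mask also conditions the text embeddings and the loss during optimization, so boundaries are respected throughout training rather than only at this final blend.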
Jingwen Li
Sichuan Normal University
Learning to Optimize, Deep Reinforcement Learning, Combinatorial Optimization Problems
Aravind Chandrasekar
School of AI and Robotics, AuroraTech Labs, USA
Mariana Rocha
Vision Intelligence Group, Federal University of Rio de Janeiro, Brazil
Chao Li
Dept. of Computer Science, Eastern Asia Institute of Technology, Beijing, China
Yuqing Chen
Department of Computer Science, Fudan University, China