OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

May 29, 2025 • 24 min • Ep. 823

Episode description

🤗 Upvotes: 57 | cs.CV

Authors:
Yiren Song, Cheng Liu, Mike Zheng Shou

Title:
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Arxiv:
http://arxiv.org/abs/2505.18445v1

Abstract:
Diffusion models have advanced image stylization significantly, yet two core challenges persist: (1) maintaining consistent stylization in complex scenes, particularly identity, composition, and fine details, and (2) preventing style degradation in image-to-image pipelines with style LoRAs. GPT-4o's exceptional stylization consistency highlights the performance gap between open-source methods and proprietary models. To bridge this gap, we propose OmniConsistency, a universal consistency plugin leveraging large-scale Diffusion Transformers (DiTs). OmniConsistency contributes: (1) an in-context consistency learning framework trained on aligned image pairs for robust generalization; (2) a two-stage progressive learning strategy decoupling style learning from consistency preservation to mitigate style degradation; and (3) a fully plug-and-play design compatible with arbitrary style LoRAs under the Flux framework. Extensive experiments show that OmniConsistency significantly enhances visual coherence and aesthetic quality, achieving performance comparable to the commercial state-of-the-art model GPT-4o.
