Pathways on the Image Manifold: Image Editing via Video Generation

Nov 28, 2024 • 25 min • Ep. 157

Episode description

🤗 Paper Upvotes: 23 | cs.CV, cs.AI, cs.LG

Authors:
Noam Rotstein, Gal Yona, Daniel Silver, Roy Velich, David Bensaïd, Ron Kimmel

Title:
Pathways on the Image Manifold: Image Editing via Video Generation

Arxiv:
http://arxiv.org/abs/2411.16819v1

Abstract:
Recent advances in image editing, driven by image diffusion models, have shown remarkable progress. However, significant challenges remain, as these models often struggle to follow complex edit instructions accurately and frequently compromise fidelity by altering key elements of the original image. Simultaneously, video generation has made remarkable strides, with models that effectively function as consistent and continuous world simulators. In this paper, we propose merging these two fields by utilizing image-to-video models for image editing. We reformulate image editing as a temporal process, using pretrained video models to create smooth transitions from the original image to the desired edit. This approach traverses the image manifold continuously, ensuring consistent edits while preserving the original image's key aspects. Our approach achieves state-of-the-art results on text-based image editing, demonstrating significant improvements in both edit accuracy and image preservation.
