Self-Consuming Generative Models with Curated Data

Best AI papers explained

May 02, 2025 • 17 min

Episode description

This paper examines how curating synthetic data, typically according to human preferences, affects the iterative retraining of generative models. The authors show theoretically that when a generative model is retrained on curated synthetic samples, the expected reward underlying the curation process increases and its variance shrinks, so the model converges toward data that maximizes that reward. Experiments show that the same dynamic can amplify bias. The authors also provide stability guarantees for retraining on a mixture of real and curated synthetic data, and they connect the setup to Reinforcement Learning from Human Feedback (RLHF): retraining on curated samples implicitly optimizes for the preferences behind the curation. The takeaway is that the growing volume of curated synthetic data online acts as an implicit preference-optimization mechanism for future generative models.
