RL Post-training Amplifies Pretraining Behaviors in Language Models

Apr 14, 2025 · 16 min

Episode description

This paper investigates how reinforcement learning (RL) fine-tuning affects language models' mathematical reasoning abilities, focusing on the influence of the pretraining data. The authors trained models from scratch on diverse open-source datasets and then applied various RL algorithms. They find that RL post-training tends to amplify the patterns of a single pretraining data distribution, which often improves performance but reduces output diversity. The output format favored after RL depends on model scale: smaller models prefer code-like formats, while larger models lean towards natural language. The study also shows that RL fine-tuning on simpler problems can yield performance gains on more challenging, unseen mathematical tasks, suggesting positive transfer of reasoning capabilities.
