s1: Simple test-time scaling
Mar 13, 2025•5 min
Episode description
- Test-time scaling improves language model performance by spending extra compute at inference time
- A dataset of 1,000 questions paired with reasoning traces (s1K) was curated for difficulty, diversity, and quality
- Budget forcing controls test-time compute by cutting the model's reasoning short, or by extending it with an appended "Wait" when the model tries to stop
- The model outperformed o1-preview by up to 27% on competition math questions
- The model and data are open-source for public access
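
The budget forcing idea from the list above can be illustrated with a minimal sketch. This is not the s1 authors' implementation: the `next_token` callable, the `END_THINK` marker, the `toy_model` stand-in, and the token-count budget are all assumptions made for illustration. The loop caps reasoning at `max_tokens`, and if the model tries to stop before `min_tokens`, it appends "Wait" so the model keeps reasoning.

```python
from typing import Callable, List

END_THINK = "</think>"  # hypothetical end-of-reasoning marker (assumption)
WAIT = "Wait"           # token appended to make the model keep reasoning


def budget_forced_reasoning(
    next_token: Callable[[List[str]], str],
    max_tokens: int,
    min_tokens: int = 0,
) -> List[str]:
    """Collect reasoning tokens, keeping the count between min_tokens and max_tokens."""
    tokens: List[str] = []
    while len(tokens) < max_tokens:
        tok = next_token(tokens)
        if tok == END_THINK:
            if len(tokens) >= min_tokens:
                break  # the model stopped on its own and the minimum is met
            tokens.append(WAIT)  # suppress the stop and nudge more reasoning
            continue
        tokens.append(tok)
    tokens.append(END_THINK)  # close the reasoning, naturally or by force
    return tokens


def toy_model(context: List[str]) -> str:
    """Hypothetical stand-in for a language model.

    Emits two "step" tokens after the start or after each "Wait",
    then tries to end its reasoning.
    """
    since_wait = 0
    for t in reversed(context):
        if t == WAIT:
            break
        since_wait += 1
    return "step" if since_wait < 2 else END_THINK


if __name__ == "__main__":
    print(budget_forced_reasoning(toy_model, max_tokens=10))
    print(budget_forced_reasoning(toy_model, max_tokens=1))
    print(budget_forced_reasoning(toy_model, max_tokens=10, min_tokens=5))
```

With a real model, `next_token` would wrap a decoding call and the budget would be counted in tokenizer tokens; those details are left out here.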
