Actor-Critic without Actor: Critic-Guided Denoising for RL

Best AI papers explained

Sep 29, 2025•16 min

Episode description

The paper introduces Actor-Critic without Actor (ACA), a reinforcement learning framework designed as a lightweight alternative to traditional actor-critic methods. ACA removes the explicit actor network and instead generates actions from the gradient field of a noise-level critic through a diffusion-based denoising process. Because no separate policy network has to be trained or stored, ACA carries less algorithmic and computational overhead than both standard and diffusion-based actor-critic approaches: it uses substantially fewer parameters while achieving competitive performance on online RL benchmarks such as the MuJoCo tasks. The key component is the noise-level critic, which conditions its value estimates on the diffusion timestep. This conditioning stabilizes the gradients used for denoising, keeps the policy immediately aligned with the critic's latest value updates, and preserves multi-modal action coverage. Overall, ACA offers a simplified, expressive, and parameter-efficient approach to online reinforcement learning.
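To make the idea of generating actions from a critic's gradient field more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's algorithm: the quadratic critic, its weighting schedule, the Langevin-style update rule, and every name in it (toy_target, toy_critic_grad, denoise_action, step_size, noise_scale) are assumptions made up for illustration. It only shows the general shape of the loop, in which an action starts as pure noise and is repeatedly nudged along the action-gradient of a critic that also depends on the denoising timestep.

```python
import numpy as np


def toy_target(state):
    # Hypothetical "preferred action" for a state. In a real system this
    # preference would be implicit in a trained critic, not given in closed form.
    return np.tanh(state)


def toy_critic_grad(state, action, t, num_steps):
    """Action-gradient of a made-up noise-level critic.

    Assumed form: Q(s, a, t) = -0.5 * w(t) * ||a - tanh(s)||^2,
    so dQ/da = -w(t) * (a - tanh(s)).
    """
    # Assumed schedule: weak guidance at high noise levels, full guidance at t = 1.
    weight = 1.0 - (t - 1) / num_steps
    return -weight * (action - toy_target(state))


def denoise_action(state, num_steps=20, step_size=0.5, noise_scale=0.1, seed=0):
    """Langevin-style denoising sketch: follow the critic gradient from noise."""
    rng = np.random.default_rng(seed)
    action = rng.standard_normal(state.shape)  # start from pure Gaussian noise
    for t in range(num_steps, 0, -1):
        # Move the noisy action uphill on the timestep-conditioned critic.
        action = action + step_size * toy_critic_grad(state, action, t, num_steps)
        if t > 1:
            # Inject shrinking noise on all but the final step.
            sigma = noise_scale * np.sqrt(t / num_steps)
            action = action + sigma * rng.standard_normal(state.shape)
    return np.clip(action, -1.0, 1.0)  # keep the action in an assumed [-1, 1] range


state = np.array([0.5, -0.3])
action = denoise_action(state)
print("sampled action:", action)
print("toy optimum:   ", toy_target(state))
```

Note that no actor network appears anywhere in the loop: the only model being queried is the critic, which is the source of the parameter savings the description mentions. The toy critic here has a single optimum, so it cannot show the multi-modal action coverage attributed to ACA.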
