Language Model Mysteries, Personalized Image Generation, Audio-Visual Transformer Innovations, DeepSeek-Prover, Dense Connector: MLLM Potential

AI Papers Podcast

May 24, 2024 • 11 min • Ep. 34


Episode description

ReVideo: Remake a Video with Motion and Content Control
Not All Language Model Features Are Linear
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Dense Connector for MLLMs
