Language Model Mysteries, Personalized Image Generation, Audio-Visual Transformer Innovations, DeepSeek-Prover, Dense Connector: MLLM Potential
May 24, 2024•11 min•Ep. 34
Episode description
ReVideo: Remake a Video with Motion and Content Control
Not All Language Model Features Are Linear
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Dense Connector for MLLMs
