Block Transformers: Faster Inference, Mobile Device AI Agents, 3D-Image Generation, Low Latency TTS
Jun 10, 2024 • 11 min • Ep. 45
Episode description
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Parrot: Multilingual Visual Instruction Tuning
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
