Chameleon's Multimodal Breakthrough, LoRA's Learning Efficiency, Many-Shot In-Context Learning, Object Detection Innovation, Text-to-3D Generation
May 18, 2024 • 11 min • Ep. 30
Episode description
Chameleon: Mixed-Modal Early-Fusion Foundation Models
LoRA Learns Less and Forgets Less
Many-Shot In-Context Learning in Multimodal Foundation Models
CAT3D: Create Anything in 3D with Multi-View Diffusion Models
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode
