Improving Agent Design, JPEG-LM's Visual Breakthrough, TurboEdit's Real-Time Image Edits, Video Segmentation Advances, LLMs Learning Like Humans, RL Benchmarks
Aug 21, 2024 • 16 min • Ep. 70
Episode description
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
Automated Design of Agentic Systems
TurboEdit: Instant text-based image editing
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
Fine-tuning Large Language Models with Human-inspired Learning Strategies in Medical Question Answering
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
