Google I/O Afterparty: The Future of Human-AI Collaboration, From Veo to Mariner
Jun 03, 2025 • 54 min
Episode description
Fresh off impressive releases at Google’s I/O event, three Google Labs leaders explain how they’re reimagining creative tools and productivity workflows. Thomas Iljic details how video generation is merging filmmaking with gaming through generative AI cameras and world-building interfaces in Whisk and Veo. Jaclyn Konzelmann demonstrates how Project Mariner evolved from a disruptive browser takeover to an intelligent background assistant that remembers context across multiple tasks. Simon Tokumine reveals NotebookLM’s expansion beyond viral Audio Overviews into a comprehensive platform for transforming information into personalized formats. The conversation explores the shift from prompting to showing and telling, the economics of AI-powered e-commerce, and why being “too early” has become Google Labs’ biggest challenge and advantage.
Hosted by Sonya Huang, Sequoia Capital
00:00 Introduction
02:12 Google's AI models and public perception
04:18 Google's history in image and video generation
06:45 Where Whisk and Flow fit
10:30 How close are we to having the ideal tool for the craft?
13:05 Where do the movie and game worlds start to merge?
16:25 Introduction to Project Mariner
17:15 How Mariner works
22:34 Mariner user behaviors
27:07 Temporary tattoos and URL memory
27:53 Project Mariner's future
29:26 Agent capabilities and use cases
31:09 E-commerce and agent interaction
35:03 NotebookLM evolution
48:26 Predictions and future of AI
Mentioned in this episode:
Whisk: Image and video generation app for consumers
Flow: AI-powered filmmaking with the new Veo 3 model
Project Mariner: Research prototype exploring the future of human-agent interaction, starting with browsers
NotebookLM: Tool for understanding and engaging with complex information, with features including Audio Overviews and now a mobile app
Shop with AI Mode: Shopping app with a virtual try-on tool based on your own photos
Stitch: New prompt-based interface for designing UIs for mobile and web applications
ControlNet paper: Outlined an architecture for adding conditional control to direct the outputs of image generation with diffusion models