Training Data

Sequoia Capital · www.sequoiacap.com
Join us as we train our neural nets on the theme of the century: AI. Sonya Huang, Pat Grady and more Sequoia Capital partners host conversations with leading AI builders and researchers to ask critical questions and develop a deeper understanding of the evolving technologies—and their implications for technology, business and society. The content of this podcast does not constitute investment advice, an offer to provide investment advisory services, or an offer to sell or solicitation of an offer to buy an interest in any investment fund.

Episodes

Arc Institute's Patrick Hsu on Building an App Store for Biology with AI

Patrick Hsu, co-founder of Arc Institute, discusses the opportunities for AI in biology beyond just drug development, and how Evo 2, their new biology foundation model, is enabling a broad ecosystem of applications. Evo 2 was trained on a vast dataset of genomic data to learn evolutionary patterns that would have taken years to find; as a result, the model can be used for applications from identifying mutations that cause disease to designing new molecular and even genome scale biological system...

Apr 15, 2025 · 58 min

Replit CEO Amjad Masad on 1 Billion Developers: A Better End State than AGI?

Amjad Masad set out more than a decade ago to pursue the dream of unleashing 1B software creators around the world. With millions of Replit users pre-ChatGPT, that vision was already becoming a reality. Turbocharged by LLMs, the vision of enabling anyone to code—from 12-year-olds in India to knowledge workers in the U.S.—seems less and less radical. In this episode, Amjad explains how an explosion in the developer population could change the economy, society and more. He also discusses his early...

Apr 08, 2025 · 1 hr 26 min

Why CRM Needs an AI Revolution, with Day.ai Founder Christopher O’Donnell

Christopher O’Donnell believes the fundamental problems with CRM—incomplete data, complex workflows, siloed work products and the fear of leads falling through the cracks—can finally be solved through AI. Founder of Day.ai and former Chief Product Officer of HubSpot, Christopher explains how his team is building a system that automatically captures the full context of customer relationships while giving users transparency and control. He shares lessons from building HubSpot’s CRM and why he’s ta...

Apr 01, 2025 · 1 hr 11 min

From Software Engineers to AI Word Artisans: Filip Kozera of Wordware

Filip Kozera sees parallels between Excel’s democratization of data analytics and Wordware’s mission to put AI development in the hands of knowledge workers. Drawing inspiration from Excel’s 750 million users (compared to 30 million software developers), Wordware is creating tools that balance the rigid structure of programming with the fuzziness of natural language. Filip explains why effective AI development requires working across multiple abstraction layers—from high-level concepts to detail...

Mar 25, 2025 · 43 min

Josh Woodward: Google Labs is Rapidly Building AI Products from 0-to-1

As VP of Google Labs, Josh Woodward leads teams exploring the frontiers of AI applications. He shares insights on their rapid development process, why today’s written prompts will become outdated and how AI is transforming everything from video generation to computer control. He reveals that 25% of Google’s code is now written by AI and explains why coding could see major leaps forward this year. He emphasizes the importance of taste, design and human values in building AI tools that will shape ...

Mar 18, 2025 · 51 min

How AI Breakout Harvey is Transforming Legal Services, with CEO Winston Weinberg

Harvey CEO Winston Weinberg explains why success in legal AI requires more than just model capabilities—it demands deep process expertise that doesn’t exist online. He shares how Harvey balances rapid product development with earning trust from law firms through hyper-personalized demos and deep industry expertise. The discussion covers Harvey’s approach to product development—expanding specialized capabilities then collapsing them into unified workflows—and why focusing on complex work like int...

Mar 11, 2025 · 54 min

The AI Product Going Viral With Doctors: OpenEvidence, with CEO Daniel Nadler

OpenEvidence is transforming how doctors access medical knowledge at the point of care, from the biggest medical establishments to small practices serving rural communities. Founder Daniel Nadler explains his team’s insight that training smaller, specialized AI models on peer-reviewed literature outperforms large general models for medical applications. He discusses how making the platform freely available to all physicians led to widespread organic adoption and strategic partnerships with publi...

Mar 04, 2025 · 1 hr 5 min

OpenAI’s Deep Research Team on Why Reinforcement Learning is the Future for AI Agents

OpenAI’s Isa Fulford and Josh Tobin discuss how the company’s newest agent, Deep Research, represents a breakthrough in AI research capabilities by training models end-to-end rather than using hand-coded operational graphs. The product leads explain how high-quality training data and the o3 model’s reasoning abilities enable adaptable research strategies, and why OpenAI thinks Deep Research will capture a meaningful percentage of knowledge work. Key product decisions that build transparency and ...

Feb 25, 2025 · 33 min

Palo Alto Networks’ Nikesh Arora: AI, Security and the New World Order

Palo Alto Networks CEO Nikesh Arora dispels DeepSeek hype by detailing all of the guardrails enterprises need to have in place to give AI agents “arms and legs.” No matter the model, deploying applications for precision-use cases means superimposing better controls. Arora emphasizes that the real challenge isn’t just blocking threats but matching the accelerated pace of AI-powered attacks, requiring a fundamental shift from prevention-focused to real-time detection and response systems. CISOs ...

Feb 18, 2025 · 1 hr · Season 1 · Ep. 30

MongoDB’s Sahir Azam: Vector Databases and the Data Structure of AI

MongoDB product leader Sahir Azam explains how vector databases have evolved from semantic search to become the essential memory and state layer for AI applications. He describes his view of how AI is transforming software development generally, and how combining vectors, graphs and traditional data structures enables high-quality retrieval needed for mission-critical enterprise AI use cases. Drawing from MongoDB's successful cloud transformation, Azam shares his vision for democratizing AI deve...

Feb 13, 2025 · 44 min

Roblox Studio Head Stef Corazza: Using AI to Empower Creators

Stef Corazza leads generative AI development at Roblox after previously building Adobe’s 3D and AR platforms. His technical expertise, combined with Roblox’s unique relationship with its users, has led to the infusion of AI into its creation tools. Roblox has assembled the world’s largest multimodal dataset. Stef previews the Roblox Assistant and the company’s new 3D foundation model, while emphasizing the importance of maintaining positive experiences and civility on the platform. Mentioned in ...

Feb 04, 2025 · 55 min

ReflectionAI Founder Ioannis Antonoglou: From AlphaGo to AGI

Ioannis Antonoglou, founding engineer at DeepMind and co-founder of ReflectionAI, has seen the triumphs of reinforcement learning firsthand. From AlphaGo to AlphaZero and MuZero, Ioannis has built the most powerful agents in the world. Ioannis breaks down key moments in AlphaGo's match against Lee Sedol (Moves 37 and 78), the importance of self-play and the impact of scale, reliability, planning and in-context learning as core factors that will unlock the next level of progress in AI. Hosted by: ...

Jan 28, 2025 · 52 min

Kumo’s Hema Raghavan: Turning Graph AI into ROI

Hema Raghavan is co-founder of Kumo, a company that makes graph neural networks accessible to enterprises by connecting to their relational data stored in Snowflake and Databricks. Hema talks about how running GNNs on GPUs has led to breakthroughs in performance as well as the query language Kumo developed to help companies predict future data points. Although approachable for non-technical users, the product provides full control for data scientists who use Kumo to automate time-consuming featu...

Jan 21, 2025 · 52 min

Databricks Founder Ion Stoica: Turning Academic Open Source into Startup Success

Berkeley professor Ion Stoica, co-founder of Databricks and Anyscale, transformed the open source projects Spark and Ray into successful AI infrastructure companies. He talks about what mattered most for Databricks' success -- the focus on making Spark win and making Databricks the best place to run Spark. He highlights the importance of striking key partnerships -- the Microsoft partnership in particular that accelerated Databricks' growth and contributed to Spark's dominance among data scienti...

Jan 14, 2025 · 1 hr

XBOW CEO and GitHub Copilot Creator Oege de Moor: Cracking the Code on Offensive Security With AI

Oege de Moor, the creator of GitHub Copilot, discusses how XBOW’s AI offensive security system matches and even outperforms top human penetration testers, completing security assessments in minutes instead of days. The team’s speed and focus are transforming the niche market of pen testing with an always-on service-as-a-software platform. Oege describes how he is building a large and sustainable business while also creating a product that will “protect all the software in the free world.” XBOW sh...

Dec 10, 2024 · 52 min

Ramp CEO Eric Glyman: Using AI to Build “Self-Driving Money”

When ChatGPT ushered in a new paradigm of AI in everyday use, many companies attempted to adapt to the new paradigm by rushing to add chat interfaces to their products. Eric has a different take—he doesn’t think chatbots are the right form factor for everything. He thinks “zero-touch” automation that works invisibly in the background can be more valuable in many cases. He cites self-driving cars as an analogy—or in this case, “self-driving money.” Ramp is a new kind of finance management company...

Dec 03, 2024 · 39 min

Dust’s Gabriel Hubert and Stanislas Polu: Getting the Most From AI With Multiple Custom Agents

After spending years at Stripe and OpenAI, Gabriel Hubert and Stanislas Polu founded Dust in early 2023 with the view that one model will not rule them all, and that multi-model integration will be key to getting the most value out of AI assistants. In this episode we’ll hear why they believe the proprietary data you have in silos will be key to unlocking the full power of AI, get their perspective on the evolving model landscape, and how AI can augment rather than replace human capabili...

Nov 26, 2024 · 1 hr 3 min

Clay’s Kareem Amin on Building the Sales ‘System of Action’ with AI

Clay is leveraging AI to help go-to-market teams unleash creativity and be more effective in their work, powering custom workflows for everything from targeted outreach to personalized landing pages. It’s one of the fastest growing AI-native applications, with over 4,500 customers and 100,000 users. Founder and CEO Kareem Amin describes Clay’s technology, and its approach to balancing imagination and automation in order to help its customers achieve new levels of go-to-market success. Hosted by:...

Nov 19, 2024 · 52 min

Decart’s Dean Leitersdorf on AI-Generated Video Games and Worlds

Can GenAI allow us to connect our imagination to what we see on our screens? Decart’s Dean Leitersdorf believes it can. In this episode, Dean Leitersdorf breaks down how Decart is pushing the boundaries of compute in order to create AI-generated consumer experiences, from fully playable video games to immersive worlds. From achieving real-time video inference on existing hardware to building a fully vertically integrated stack, Dean explains why solving fundamental limitations rather than specif...

Nov 13, 2024 · 47 min

How Glean CEO Arvind Jain Solved the Enterprise Search Problem – and What It Means for AI at Work

Years before co-founding Glean, Arvind was an early Google employee who helped design the search algorithm. Today, Glean is building search and work assistants inside the enterprise, which is arguably an even harder problem. One of the reasons enterprise search is so difficult is that each individual at the company has different permissions and access to different documents and information, meaning that every search needs to be fully personalized. Solving this difficult ingestion and ranking pro...

Oct 29, 2024 · 45 min

OpenAI Researcher Dan Roberts on What Physics Can Teach Us About AI

In recent years there’s been an influx of theoretical physicists into the leading AI labs. Do they have unique capabilities suited to studying large models or is it just herd behavior? To find out, we talked to our former AI Fellow (and now OpenAI researcher) Dan Roberts. Roberts, co-author of The Principles of Deep Learning Theory, is at the forefront of research that applies the tools of theoretical physics to another type of large complex system, deep neural networks. Dan believes that DLLs, ...

Oct 22, 2024 · 42 min

Google NotebookLM’s Raiza Martin and Jason Spielman on Creating Delightful AI Podcast Hosts and the Potential for Source-Grounded AI

NotebookLM from Google Labs has become the breakout viral AI product of the year. The feature that catapulted it to viral fame is Audio Overview, which generates eerily realistic two-host podcast audio from any input you upload—written doc, audio or video file, or even a PDF. But to describe NotebookLM as a “podcast generator” is to vastly undersell it. The real magic of the product is in offering multi-modal dimensions to explore your own content in new ways—with context that’s surprisingly add...

Oct 15, 2024 · 32 min

Snowflake CEO Sridhar Ramaswamy on Using Data to Create Simple, Reliable AI for Businesses

All of us as consumers have felt the magic of ChatGPT—but also the occasional errors and hallucinations that make off-the-shelf language models problematic for business use cases with no tolerance for errors. Case in point: A model deployed to help create a summary for this episode stated that Sridhar Ramaswamy previously led PyTorch at Meta. He did not. He spent years running Google’s ads business and now serves as CEO of Snowflake, which he describes as the data cloud for the AI era. Ramaswamy...

Oct 08, 2024 · 59 min

OpenAI's Noam Brown, Ilge Akkaya and Hunter Lightman on o1 and Teaching LLMs to Reason Better

Combining LLMs with AlphaGo-style deep reinforcement learning has been a holy grail for many leading AI labs, and with o1 (aka Strawberry) we are seeing the most general merging of the two modes to date. o1 is admittedly better at math than essay writing, but it has already achieved SOTA on a number of math, coding and reasoning benchmarks. Deep RL legend and now OpenAI researcher Noam Brown and teammates Ilge Akkaya and Hunter Lightman discuss the ah-ha moments on the way to the release of o1, ...

Oct 02, 2024 · 45 min

Why Vlad Tenev and Tudor Achim of Harmonic Think AI Is About to Change Math—and Why It Matters

Adding code to LLM training data is a known method of improving a model’s reasoning skills. But wouldn’t math, the basis of all reasoning, be even better? Up until recently, there just wasn’t enough usable data that describes mathematics to make this feasible. A few years ago, Vlad Tenev (also founder of Robinhood) and Tudor Achim noticed the rise of the community around an esoteric programming language called Lean that was gaining traction among mathematicians. The combination of that and the p...

Sep 24, 2024 · 40 min

Jim Fan on Nvidia’s Embodied AI Lab and Jensen Huang’s Prediction that All Robots will be Autonomous

AI researcher Jim Fan has had a charmed career. He was OpenAI’s first intern before he did his PhD at Stanford with “godmother of AI,” Fei-Fei Li. He graduated into a research scientist position at Nvidia and now leads its Embodied AI “GEAR” group. The lab’s current work spans foundation models for humanoid robots to agents for virtual worlds. Jim describes a three-pronged data strategy for robotics, combining internet-scale data, simulation data and real world robot data. He believes that in th...

Sep 17, 2024 · 49 min

Founder Eric Steinberger on Magic’s Counterintuitive Approach to Pursuing AGI

There’s a new archetype in Silicon Valley, the AI researcher turned founder. Instead of tinkering in a garage they write papers that earn them the right to collaborate with cutting-edge labs until they break out and start their own. This is the story of wunderkind Eric Steinberger, the founder and CEO of Magic.dev. Eric came to programming through his obsession with AI and caught the attention of DeepMind researchers as a high school student. In 2022 he realized that AGI was closer than he had p...

Sep 10, 2024 · 51 min

Crucible Moments Returns for S2: The ServiceNow Story ft. CEO Frank Slootman & Founder Fred Luddy

On Training Data, we learn from innovators pushing forward the frontier of AI’s capabilities. Today we’re bringing you something different. It’s the story of a company currently implementing AI at scale in the enterprise, and how it was built from a bootstrapped idea in the pre-AI era to a 150 billion dollar market cap giant. It’s the Season 2 premiere of Sequoia’s other podcast, Crucible Moments, where we hear from the founders and leaders of some legendary companies about the crossroads and in...

Sep 03, 2024 · 43 min

Sierra Co-Founder Clay Bavor on Making Customer-Facing AI Agents Delightful

Customer service is hands down the first killer app of generative AI for businesses. The reasons are simple: the costs of existing solutions are so high, the satisfaction so low and the margin for ROI so wide. But trusting your interactions with customers to hallucination-prone LLMs can be daunting. Enter Sierra. Co-founder Clay Bavor walks us through the sophisticated engineering challenges his team solved along the way to delivering AI agents for all aspects of the customer experience that are...

Aug 27, 2024 · 1 hr 13 min

Phaidra’s Jim Gao on Building the Fourth Industrial Revolution with Reinforcement Learning

After AlphaGo beat Lee Sedol, a young mechanical engineer at Google thought of another game reinforcement learning could win: energy optimization at data centers. Jim Gao convinced his bosses at the Google data center team to let him work with the DeepMind team to try. The initial pilot resulted in a 40% energy savings and led him and his co-founders to start Phaidra to turn this technology into a product. Jim discusses the challenges of AI readiness in industrial settings and how we have to buil...

Aug 20, 2024 · 51 min