Google AI Infrastructure PM On New TPUs, Liquid Cooling and More

The New Stack Podcast

May 13, 2025 • 20 min • Ep. 1528

Episode description

At Google Cloud Next '25, Google introduced Ironwood, its most advanced custom Tensor Processing Unit (TPU) to date. A full pod of 9,216 chips delivers 42.5 exaflops of compute, and Ironwood doubles the performance per watt of its predecessor. Senior product manager Chelsie Czop explained that designing TPUs involves balancing power, thermal constraints, and interconnectivity.

Google's long-term investment in liquid cooling, now in its fourth generation, plays a key role in managing the heat these chips generate. Czop noted that the incremental design improvements are visible in the data center itself, for example in where the liquid cooling pipes are placed.

Customers often ask whether to use TPUs or GPUs, and the answer depends on their specific workloads and infrastructure. Some, like Moloco, have seen a 10x performance boost by moving directly from CPUs to TPUs, while many others continue to use both TPUs and GPUs. Because models evolve faster than hardware, Google relies on collaborations with teams like DeepMind to anticipate future needs.

Learn more from The New Stack about the latest AI infrastructure insights from Google Cloud:

Google Cloud Therapist on Bringing AI to Cloud Native Infrastructure

A2A, MCP, Kafka and Flink: The New Stack for AI Agents
