What does it mean to change the economics of software development? Nathan Marz joins the show to share how they reduced the cost of building Mastodon at Twitter-scale by 100X and the 10-year journey to build Rama, a new programming platform that made this feat possible. Nathan is the founder of Red Planet Labs. Prior to RPL, he led engineering for BackType, which was acquired by Twitter in 2011. Nathan created the Apache Storm project and wrote the book Big Data: Principles and best practices of...
Sep 22, 2023•2 hr 33 min•Transcript available on Metacast Kelsey Hightower was a Distinguished Engineer at Google, where he worked on Google Cloud Platform. In this second part of the conversation, we focus on Kelsey's retirement - the financial planning that enabled him to retire at 42, how he got started advising startups and his perspectives on compensation, turning down a substantial offer from Microsoft and meeting Satya Nadella in person. And, of course, plans for the future.
Aug 03, 2023•2 hr 31 min•Transcript available on Metacast Kelsey Hightower was a Distinguished Engineer at Google, where he worked on Google Cloud Platform. In this first part of the conversation, we delve into pivotal moments in Kelsey's career journey, ranging from buying his first car by working at McDonald's after school, to starting his own computer store that turned into a music studio after 6pm, to hacking on Python infrastructure with the core developers. Through these stories, we learned a ton about how Kelsey thinks about acquiring new skills - ...
Jul 24, 2023•1 hr 6 min•Transcript available on Metacast Julie Amundson is a Sr Staff Software Engineer at Google working on Machine Learning Infrastructure. Prior to Google, she was the Director of Machine Learning Infrastructure at Netflix. Julie decided to take a career break last year when she was affected by mass layoffs. In this conversation, we talk to her about what it was like to find a job during hiring freezes, what it was like to position herself in this market, whether the interviewers cared about the career break she took and how the car...
Jun 27, 2023•57 min•Ep 20•Transcript available on Metacast Chris Pruett is the CTO and Co-founder of Jam - a new way to share and listen to bite-sized audio. Prior to Jam, Chris spent 9+ years at LinkedIn growing from an engineering manager to VP of Engineering. During his tenure at LinkedIn, he worked on almost all aspects of the app and towards the end, led an org of 500+ engineers working on Feed, Messaging, Identity and Search. In this episode, we discuss how he made the decision to leave his leadership position at LinkedIn and co-found Jam. We also...
Jun 03, 2022•1 hr 20 min•Ep 19•Transcript available on Metacast Short episode about reflections on the past year and plans for 2022.
Mar 25, 2022•8 min•Transcript available on Metacast Kailash is the CTO at Zerodha, the largest stock broker in India. In this conversation, we speak with him about absurdism - a philosophy that guides his personal and professional worldview. We discuss how he built Zerodha's tech team, their team culture and how the team operates so efficiently while being so lean. We also discuss why Zerodha self-hosts all of their tech stack, what they look for when hiring engineers and how their systems scaled when the user base grew from 2 to 8 million in 18 ...
Feb 25, 2022•2 hr 38 min•Ep 19•Transcript available on Metacast Michael Lynch is the founder of TinyPilot. After doing software engineering at Microsoft and Google for 7 years, Michael decided in 2018 to quit and start working for himself by building small software businesses. From years of negative profit to now building a $450K+ ARR hardware business, Michael joins the show to chat about what made him quit his cushy job at Google, how he builds in public with monthly retrospectives, what he has learned over 3 years of indie hacking and much more....
Jan 14, 2022•2 hr 47 min•Ep 17•Transcript available on Metacast Cory is currently a Solutions Engineer at Jeli.io and very well known in the community for his work on Observability. His career in observability began at Twitter where he managed the observability team and then he joined Stripe, where he created and led the observability team, this time around as a Principal Engineer. We talk to him about how he got his start in customer support and the role it played in the later part of his career. We discuss his time at Twitter where there was a power outage...
Nov 12, 2021•1 hr 24 min•Ep 17•Transcript available on Metacast Ashwin is a Startup Partnership Lead at Stripe. From web development to co-founding a YC startup, to deep learning, Ashwin has a knack for picking up new skills extremely quickly. In this episode, we chat about the methods he employed to successfully make these transitions, learnings/tips from winning 30+ hackathons in a row, and what engineers can gain from better story-telling.
Oct 12, 2021•1 hr 14 min•Ep 16•Transcript available on Metacast Bruno Connelly is a VP of Engineering at LinkedIn. He leads the Site Engineering org responsible for LinkedIn's production infrastructure. He joins the show to talk about his journey in tech - from teaching himself how to code at a young age, building, maintaining and reverse engineering software as a teenager, building ISPs in the early part of his career (there are some fun stories that involve sleeping in the data center) to leading the SRE org at LinkedIn over the last decade. He talks about...
Sep 12, 2021•1 hr 6 min•Ep 15•Transcript available on Metacast With 5+ years of experience building resilient systems at Netflix scale, Lorin joins the show to chat about his favorite incident story, the path that led him to doing chaos engineering (and later away from it), and advocating for a dedicated analyst to talk to people after an incident. Throughout the conversation, Lorin shares his philosophy and tips on how to learn from incidents, what engineers can gain from writing better, and why some metrics may not be as useful as you think....
Aug 14, 2021•1 hr 24 min•Ep 14•Transcript available on Metacast Spoons is the Co-founder and Chief Architect of Lightstep. He joins the show to talk about building systems at Google scale and various aspects that make Google a weirder place than other companies. We talked about Spoons's journey of leaving Google and deciding to join Lightstep as a co-founder. We dig into the challenges during the early days of Lightstep and discuss the importance of speaking to customers to build the right product. We talk about what it's like to start a family and run a start...
Jul 09, 2021•1 hr 14 min•Ep 13•Transcript available on Metacast Having led 100+ ML projects at Insight and built ML systems at Stripe scale, Emmanuel joins the show to chat about how to build useful ML products and what happens next when the model is in production. Throughout the conversation, Manu shares stories and advice on topics like the common mistakes people make when starting a new ML project, what's similar and different about the lifecycle of ML systems compared to traditional software, and writing a technical book....
Jun 11, 2021•1 hr 13 min•Ep 12•Transcript available on Metacast Todd is a Sr Director of Engineering at Google where he leads Site Reliability Engineering teams for Machine Learning. Having recently presented on how ML breaks in production, by examining more than a decade of outage postmortems at Google, Todd joins the show to chat about why many ways that ML systems break in production have nothing to do with ML, what's different about engineering reliable systems for ML vs. traditional software (and the many ways that they are similar), what he looks for w...
May 07, 2021•1 hr 8 min•Ep 11•Transcript available on Metacast Evan Estola (https://twitter.com/estola) is a Director of Engineering at Flatiron Health where he's leading software engineering teams focused on building Machine Learning products. Throughout this episode, Evan shares various stories of when recommendation systems didn't work as expected, like the one time when members saw the mathematically worst recommendations for meetups near them. He also shares why Schenectady, NY pops up on some lists of most popular cities and the story behind the Wall Stree...
Apr 23, 2021•1 hr 13 min•Ep 10•Transcript available on Metacast Uma is a VP of Engineering at Render. In this episode, she shared with us her insights on how to successfully manage infrastructure migrations. We discussed the importance of communicating the "why" behind a migration, identifying success metrics, creating a culture where migrations are identified as highly impactful projects and much more. Uma also shared stories where parts of a migration didn't go as planned, how the team fixed the issue and the kind of engineers she thinks would make good tec...
Apr 09, 2021•1 hr 2 min•Ep 9•Transcript available on Metacast Charity Majors (https://twitter.com/mipsytipsy) is the co-founder and CTO of Honeycomb.io. Before this she worked at Facebook, Parse and Linden Lab on infrastructure and developer tools, and always seemed to wind up running the databases. She is the co-author of the book Database Reliability Engineering and also has an amazing blog at charity.wtf. We love the content on her blog and have learned a lot from it. We had a lot of fun speaking with Charity in this lively conversation! We learned about...
Mar 20, 2021•1 hr 7 min•Ep 8•Transcript available on Metacast Tammy Bryant Butow is a Principal SRE at Gremlin where she works on Chaos Engineering. In this episode, we discuss how her curiosity led her to the world of infrastructure engineering, an outage from her early days where a core switch took down half the datacenter, her experience running a disaster recovery test and how it taught her about the importance of injecting failures into a system to make it more resilient. We also touch on advanced failure injection techniques, how chaos engineering is...
Mar 07, 2021•1 hr 4 min•Ep 7•Transcript available on Metacast Oliver Leaver-Smith, better known as Ols, is a Senior Devops Engineer at Sky Betting and Gaming. In this episode, we discuss how a seemingly simple monitoring change ended up taking down the entire site. We also talk about chaos and resilience engineering. We discuss how the team at Sky Betting and Gaming conducts fire drills (chaos engineering exercises) where they not only test the resiliency of their software systems but also their people systems. We walk through a recent example of a fire dr...
Feb 19, 2021•1 hr 1 min•Ep 6•Transcript available on Metacast Ryan Underwood is a Staff SRE and tech lead on the Helix and Zookeeper SRE team at LinkedIn. Prior to LinkedIn, he was an SRE at Machine Zone and Google. Apart from his regular responsibilities, Ryan's interests and expertise include debugging production kernel, I/O and containerization issues. His opinion about not treating software as a black box and his persistent approach to debugging complex problems are truly inspiring. On several occasions, Ryan's colleagues have leaned on him to solve an es...
Feb 06, 2021•1 hr 3 min•Ep 5•Transcript available on Metacast David is LinkedIn's former SVP of Engineering and Operations. He came out of retirement to join LinkedIn in 2009 during a time of rapid growth. After 4 years at LinkedIn, he retired in 2013. Throughout his career, David has been in multiple leadership positions and has been recognized as one of the best Operations Executives. This was an extremely fascinating conversation. David shares insightful stories from early days at LinkedIn and what it took to develop the culture of Site Up and Secure. He...
Jan 23, 2021•58 min•Ep 4•Transcript available on Metacast In this episode, we speak with Julia Evans. Julia runs a programming zines business called Wizard Zines (https://wizardzines.com/), where she creates comics about various programming concepts. She began creating zines while she was still a software engineer at Stripe. Her zines are extremely approachable and highly educational. In addition to creating zines, Julia is a prolific blogger and has around 500 posts on her blog at jvns.ca. Her blog posts are another great source to learn about fundam...
Jan 06, 2021•46 min•Ep 3•Transcript available on Metacast In this episode, we speak with Kelsey Hightower, who is currently a Principal Developer Advocate at Google and one of the most influential individuals in the Kubernetes community. He is also an author and a keynote speaker, with a knack for demystifying complex topics, doing live demos and enabling others to succeed. In this insightful conversation, we cover wide-ranging topics from his role at Google to the art of storytelling. We get into some very interesting details of how Kubernetes can brea...
Dec 04, 2020•1 hr 1 min•Ep 2•Transcript available on Metacast In this episode, Ronak, Austin and Guang share the origin story - who they are, what this podcast is about and why they are doing this. They've seen firsthand how stressful it is when something breaks in production but also found it to be the best opportunity to learn about a system more deeply. They started this podcast to have in-depth conversations with software and devops experts and hear their stories from the trenches about how software breaks in production. In upcoming conversations, they...
Nov 28, 2020•4 min•Ep 1•Transcript available on Metacast