148: Site Reliability Engineering with Niall Murphy

May 05, 2018 · 1 hr

Episode description

In this week’s episode, we are lucky to be joined by Niall Murphy to discuss the discipline of Site Reliability Engineering. We start off by speaking about how he got into computing, how the SRE role came to be, and what drew him to it. From here, we highlight the position of an SRE within a company or group, what SLAs are, and the benefits of capping operations work at 50% and of holding blameless postmortems. This leads us to talk about why striving for 100% uptime is actually detrimental to the product, and the benefits of having an error budget. Finally, we discuss how the role has evolved since its inception, the Wheel of Misfortune, and what drew him to contribute to the seminal SRE book.