Hey friends, it's been a while. I haven't been on-call, but I have been working on meeting tons of new people for new content for this podcast. I can't do it alone though. Would you like to be on the podcast? Reach out! Twitter: https://twitter.com/OnCallNightmare Email: [email protected] The commitment for your story is under 35 minutes and you'll have a lasting testimony of your experience on-call.
May 08, 2020•1 min
Well 2019 is just about done, that means one more podcast. This time I break format a bit and welcome on Corey Quinn. Corey and I take a look at how he founded the company and how they help people save money on their AWS bills. Then Corey and I take a dive into some of the topics that impacted the cloud in 2019. A fun conversation to end 2019! Corey is the Cloud Economist at The Duckbill Group. Corey specializes in helping companies improve their AWS bills by making them smaller and less horrify...
Dec 23, 2019•36 min
It's the One Year Anniversary of On-Call Nightmares. When I set out to start this podcast, there were a few people on a list that I just felt I needed to speak to. I finally checked off the first name I had on the list. Episode 45 is a conversation with Google Principal Developer Advocate, Kelsey Hightower. Kelsey Hightower is a Technologist working at Google while learning in public. https://twitter.com/kelseyhightower https://github.com/kelseyhightower/kubernetes-the-hard-way
Dec 19, 2019•34 min
This week I chat with Silvia Botros also known as the @dbsmasher from Twitter. I learn about her experiences on-call for databases, motherhood and an affinity for breaking things. An awesome conversation with an incredible person. Silvia Botros is a Sr Principal Engineer at Twilio. She focuses on ways to break databases but is also talented at finding bugs in all your software. Whether she helped build it or not. When she is not helping Twilio Sendgrid send billions of emails a day, she is busy ...
Nov 21, 2019•34 min
One of the best parts of attending DOES 2019 in Las Vegas was meeting so many of the leaders and innovators from the world of DevOps. Damon Edwards's work is extremely well known in the DevOps field and I was lucky enough to discuss his history during this interview. Damon Edwards is a Co-Founder of Rundeck Inc., the makers of Rundeck, the popular open source Operations Management Platform. Damon has spent over 15 years working with both the technology and business ends of IT Operations and is n...
Nov 14, 2019•31 min
The number 42 has a huge meaning for baseball fans. Jackie Robinson wore 42, Mariano Rivera wore 42 and now one of the greatest in DevOps, John Willis wears the On-Call Nightmares podcast episode #42! Learn from John's past, his present and his future at Red Hat. We got together at the 2019 DevOps Enterprise Summit in Las Vegas to chat about all things DevOps and a lil Yankees baseball (not much). By far one of the most important episodes of the podcast yet. John Willis has worked in the IT mana...
Oct 31, 2019•36 min
On-Call Nightmares returns to talk to the man from Texas who represents Big Blue, JJ Asghar. JJ and I discuss his start as a 15-year-old in technology and how on-call has morphed over the years. JJ works at IBM on the IBM cloud as a Developer Advocate. He’s focusing on the IBM Kubernetes Service trying to make companies and users have a successful onboarding to the Cloud Native ecosystem. He lives and grew up in Austin, Texas. He enjoys a good strong stout, hoppy IPA, and some team building Art...
Oct 24, 2019•42 min
A big milestone, episode 40! This week I speak with Netflix SRE Ryan Kitchens about birds, DR and movies! Ryan Kitchens has been in a variety of positions in software over the past ten years allowing him to experience the good and the bad, the amazing and the bizarre. As an SRE with a film degree, he currently works at Netflix on the CORE team, focused on ensuring availability. The background of the team spans incident management and analysis, resilience engineering, and human factors & syste...
Oct 10, 2019•34 min
This week I speak with Dan Bentley of tilt.dev! Dan is a software engineer who's currently fixing microservice development as CEO of Tilt ( https://tilt.dev ). Before that, he was at Google for 11 years and then Twitter, working on tools for devs and tools for non-developers. He's opened for The Who and has checks from Donald Knuth. Transcript: https://aka.ms/AA64hk6 https://tilt.dev https://twitter.com/dbentley
Sep 25, 2019•33 min
Live from DevOpsDays Portland, I speak with Gene Kim, Author of "The Phoenix Project" and the upcoming book "The Unicorn Project." When I started this podcast, one of my goals was to talk to Gene about his own experiences in IT. Thankfully, this trip to DevOpsDays in PDX helped that happen. Cameos by Jennifer Davis, Matty Stratton, Jason Yee and Terri Haber! Gene Kim is a multiple award-winning CTO, researcher and author, and has been studying high-performing technology organizations since 1999. ...
Sep 12, 2019•36 min
The On-Call Nightmares Listener feedback system works! Without your stories I just cannot do this podcast. Thankfully, Jason Schuster reached out to share his experience in a 20 year career in technology. Share in his nightmare on this latest episode! Transcript: https://aka.ms/AA606at Jason's Bio: After graduating with a BFA in theater design in 2000 I landed my first job admiring HPUX servers. I took a low ball salary in exchange for training. While I got the training it took a long while for ...
Sep 05, 2019•28 min
Live from DevOpsDays Chicago! I meet up with Ops Veteran, Michael Stahnke as we discuss his career in technology. From the weird days of AIX systems all the way till his time now at CircleCI, Michael has plenty of great stories. Special cameos by Jason Yee and Joshua Zimmerman (our laugh track). Michael Stahnke is VP of Platform Engineering at CircleCI. Prior to this role, he was at Puppet running engineering for Puppet Enterprise, Puppet Open source, and SRE. He is an author for State of DevOps...
Aug 29, 2019•31 min
Getting paid is a pretty dang important part of your job. Mike Grayson and the team at Paychex are working to make sure that the databases that handle that are always online. This week I catch up with Mike Grayson who's been a great advocate for the database ops community. Mike is a Senior Database Engineer specializing in DevOps, MongoDB, and Apache Kafka based out of Rochester, New York. He is a MongoDB Master and speaker in the Oracle, SQL Server and MongoDB communities. Transcript: https://a...
Aug 22, 2019•30 sec
X gonna give it to ya! Xander from the Microsoft Azure Kubernetes SRE Team joins me to talk about his history on-call and more! Xander is a Site Reliability Engineer at Microsoft, he currently slings containers on Azure Kubernetes Service. Previous to Microsoft, he did all the things with retail tech at both Starbucks and Target. You are always welcome to send him your favorite cat pictures. @XanderGrzy https://github.com/salaxander Full Transcript: https://aka.ms/AA5r8ja
Aug 08, 2019•36 min
On-call can come in different shapes and sizes. Sometimes it's a group of developers who are attacking a problem to keep other developers afloat. That's what Ben Halpern and the team at the DEV Community are up to. Founder of DEV, Canadian, generalist software developer who writes a lot of Ruby. Transcript: https://aka.ms/AA5r8ja https://dev.to/ben https://twitter.com/bendhalpern
Aug 01, 2019•42 min
This week I speak with my friend Matty Stratton as we discuss the hard times and the processes to make them better. Matty Stratton is a DevOps Advocate at PagerDuty, where he helps dev and ops teams advance the practice of their craft and become more operationally mature. He collaborates with PagerDuty customers and industry thought leaders in the broader DevOps community, and back when he drove, his license plate actually said “DevOps”. Matty has over 20 years experience in IT operations, rangi...
Jul 25, 2019•50 min
Datadog Dash was this week which meant I was lucky enough to catch up with my friend, Jason Yee. We discuss his time in tech, measuring everything and a lot more! Jason is a technical evangelist at Datadog, where he works to inspire developers and ops engineers with the power of metrics and monitoring. Previously, he was the community manager for DevOps & Performance at O'Reilly Media and a software engineer at MongoDB. He's currently exploring the world while living as a nomad and would lov...
Jul 18, 2019•34 min
Episode 30 is a waterfall of information you'll soak up and learn a ton from. Things get a bit wet and wild for Tim in this episode of On-Call Nightmares! A great discussion about a long history in tech, the things you just can't plan for and more. Tim is an engineering manager at InfluxData with over 20 years of experience. His technical interests include high-performance, scalable, fault-tolerant cloud infrastructure, interconnected hybrid architecture, containerization (c14n?) all the way dow...
Jul 11, 2019•42 min
This week's conversation is with Molly Struve of Kenna Security! We discuss her path to tech, how her team worked to fix their on-call rotation and more! Molly Struve is the Lead Site Reliability Engineer at Kenna Security. She joined Kenna in 2015 and has had the opportunity to work on some of the most challenging aspects of Kenna’s code base. This includes scaling Elasticsearch, sharding MySQL databases, and creating an infrastructure that can grow as fast as Kenna's business. When not making ...
Jul 03, 2019•34 min
This week my homie supreme, Jason Hand joins me on On-Call Nightmares. We talk monitoring, SRE and getting in the van. Jason has spent the last 5 years connecting with technologists around the world on ideas related to balancing system and service reliability with the speed and agility required in today's digital world. Previously at VictorOps, Jason authored four books on the subjects of Site Reliability Engineering, Post-Incident Reviews, and ChatOps and was named "DevOps Evangelist of the Yea...
Jun 27, 2019•45 min
This week, I bring a friend from a past job to share his insights on observability and other aspects of a weird life in technology. This is one of my favorite chats because Joe is one of my favorite people in tech. "Customer-concerned Operations and Systems workers turned Cloud Native lab-rat at Packet, previously of DigitalOcean, IBM, Recurly, Platform9 Systems. Approach to Production engineering relies on an iterative combination of programmatically-led audits, collaborative remediation, and m...
Jun 13, 2019•40 min
This week I speak with Jacquie of MedStack! We get insights into how her career started including a nightmare where she's thrown right into the fire. Jacquie has worked in FinTech, media, and is currently in eHealth working at MedStack, a digital app platform for the healthcare industry. She's passionate about solving problems with a holistic approach, and bridging the gaps in communication and systems. Building something meaningful is important to her – from making healthcare accessible to crea...
Jun 06, 2019•29 min
Live from DevOpsDays Toronto, I meet up with my fellow DevRel road warrior, Quintessence Anx of Logz.io. Quintessence brings years of experience and compassion to her role. Quintessence is a champion for mindfulness around accessibility and diversity. In her own words... I’ve worked in the IT community for over 10 years, including as a database administrator and a DevOps / Cloud / Infrastructure engineer. I was a core contributor to Stark & Wayne’s SHIELD project, which adds backup functional...
May 30, 2019•30 min
Live from ChefConf 2019, I talk with Nathen Harvey about outages, lunch and a life spent in technology. This was one of my favorite podcast interviews because Nathen is one of my major influences and mentors in what we do in Developer Advocacy and Relations in technology. He's taught me so much over the years and has done his best to check in with me during the tough moments, like another member of the on-call team might do during a rough incident. Nathen Harvey, Cloud Developer Advocate at Goog...
May 23, 2019•38 min
This week we speak with Gremlin's Community Manager, Rich Burroughs, on his time on-call. We discuss power outages, active-active datacenters and other perspectives from a long career in technology. Rich Burroughs is a Community Manager at Gremlin where he’s focused on growing and strengthening the Chaos Engineering community. He previously worked at Puppet as an SRE and in other operational roles over the years. Rich spent about twenty years of his career in oncall rotations, and is driven by h...
May 16, 2019•37 min
Bonus! ME!!! I spoke at Microsoft's community event "bits of //build" about overcoming failure. This is a culture talk I have been working on that really focuses on my personal road through failure and recovery. Thanks to all who sat in the room and took part. https://twitter.com/jaydestro
May 09, 2019•23 min
This week I get a chance to speak to someone who just wants to save you some money on your cloud bills. Mike shares some great stories and gives insight to what he and Corey Quinn are working on at the Duckbill Group. Mike is the CEO of The Duckbill Group, a consultancy helping companies fix the horrifying AWS bill by both lowering the size of it and helping them understand where the money is going. Mike also hosts the Real World DevOps Podcast, is the author of O’Reilly’s Practical Monitoring, ...
May 09, 2019•40 min
Who wakes up the people who get woken up for on-call? The folks at PagerDuty are responsible for providing pager notifications to teams across the globe. In this interview I talk with Arup Chakrabarti who's dedicated to getting you your alerts. Arup has been working in the space of software operations since 2007. He started out as an Operations Engineer at Amazon, helping to reduce customer defects with multiple teams for the Amazon Marketplace. Since then, he has managed and built operations tea...
Apr 25, 2019•39 min
LET'S GET WEIRD. LET'S GET WEIRD. LET'S GET WEIRD. This week we talk with Nick Maludy of Encore Technologies on some "weird on-prem" he managed when working as a Defense Contractor. Nick brings unique insight into having to manage critical systems from 10,000 feet above the ground. After graduating, Nick Maludy worked for ~5 years at a Department of Defense contractor called SilverBlock Systems. There they developed an "integration platform" for performing sensor research for next-gen Navy aircraf...
Apr 18, 2019•38 min
You know that little box on the lower bottom of the window you see that asks you if you need help on websites? Well Shayon is part of the team that keeps that online for businesses across the planet. We chat a bit about his time on-call and other topics. Shayon is a System Engineer at Intercom. He is part of the internal infrastructure team that is responsible for Intercom's Availability, Performance, and Scalability. Outside of regular system/infra work, he has played other roles at Intercom, a...
Apr 11, 2019•32 min