Introducing The Show

Jan 08, 2017 · 4 min

Episode description

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
  • You can help support the show by checking out the Patreon page which is linked from the site.
  • To help other people find the show you can leave a review on iTunes or Google Play Music, share it on social media, and tell your friends and co-workers.
  • I’m your host, Tobias Macey, and today I’m speaking with Maxime Beauchemin about what it means to be a data engineer.
Interview
  • Who am I
    • Systems administrator and software engineer, now DevOps, focus on automation
    • Host of Podcast.__init__
  • How did I get involved in data management
  • Why am I starting a podcast about Data Engineering
    • Interesting area with a lot of activity
    • Not currently any shows focused on data engineering
  • What kinds of topics do I want to cover
    • Data stores
    • Pipelines
    • Tooling
    • Automation
    • Monitoring
    • Testing
    • Best practices
    • Common challenges
    • Defining the role/job hunting
    • Relationship with data engineers/data analysts
  • Get in touch and subscribe
    • Website
    • Newsletter
    • Twitter
    • Email

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Transcript

Tobias Macey

Hello, and welcome to the Data Engineering Podcast, the show about modern data infrastructure. You can go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show, you can leave a review on iTunes or Google Play Music, share it on social media, and tell your friends and coworkers.

I'm your host, Tobias Macey, and I'd like to introduce a new podcast that I'm starting called the Data Engineering Podcast. A little bit of background about myself. I started off as a systems administrator and moved up through to being a full-time software engineer for a while. Now I'm working across those boundaries as a DevOps engineer, and I have a strong focus on automation and all things. I've also done a fair bit of data engineering myself.

I also currently host another podcast called Podcast.__init__, which is about the Python programming language and its community.

So I first got involved in data management through my work as a systems administrator, just managing databases and data pipelines through various different systems. And in my next role, I did a fair bit of data analysis and data management for a company that received a lot of data from a distributed sensor network. That meant debugging the issues with those sensors and making sure that there was adequate reporting data available to be able to report on the accuracy of those sensors, etcetera. I've had an interest in the data management and data engineering space for a while, as well as data science.

And I've noticed that there are a lot of different podcasts available for data science as a discipline, but there aren't any that are dedicated to data engineering and data infrastructure, which is a similar situation that I found myself in with Python where I was a big fan of the Python programming language and its community, but there weren't any podcasts that focused on that particular market. So in this case, I decided to start a new podcast about data engineering.

And it's an interesting area that's got a lot of activity, and there's been a lot of growth in the idea of data engineering and data infrastructure. And so as far as the kind of subject area that I wanna cover with this podcast, I'm interested in covering things like databases and data storage, data pipelines and management, and some of the different tooling that is useful and important for the discipline of data engineering.

I also like to focus on automation, both in terms of data automation as far as collecting and aggregating it and processing it, but also automation of the underlying infrastructure, monitoring of data pipelines so that you can be sure that the proper processes are taking place, as well as testing of those data pipelines so that you can be sure as you roll out changes that you're not gonna end up with bad data at the end of it.

I also wanna cover some of the different best practices that are evolving in the space, some of the common challenges that are faced by practitioners, and also working on trying to help define what it actually means to be a data engineer and some of the challenges associated with finding jobs in that particular space because it is such an evolving area of technology.

Another subject that I think would be interesting to cover is the relationship between data engineers and data scientists and what that working relationship looks like. So if this is something that you're interested in, then I suggest you subscribe to the feed because there are gonna be a number of interesting episodes. I've already been speaking with a few different people, including some of the folks from Pachyderm and Maxime Beauchemin from Airbnb.

I've got an episode about the Dask distributed environment coming up. So go to the site, subscribe to the show, sign up for the newsletter, follow me on Twitter, or send me an email. All those links are gonna be on the site. So thanks for tuning in, and I look forward to having you back soon.

Transcript source: Provided by creator in RSS feed