
Plumbers of Data Science

Andreas Kretz (learndataengineering.com)
Data Engineering is the plumbing of data science. Almost invisible, but super important and a big mess when done wrong. We talk about interesting Data Engineering trends and topics. I also teach Data Engineering in my Data Engineering Academy at LearnDataEngineering.com.

Episodes

#061 Reworking My Cookbook For Data Engineering

I decided to rework the cookbook focusing more on case studies and less on explaining tools. People keep asking me for a path to become a data engineer and, let's be honest, you will never achieve that with just knowledge of the tools. Finding out how companies do data engineering on their data science platforms is way more useful. Over the next weeks we will go over each study on my YouTube channel. The stuff we talk about will then go into the cookbook too.

May 27, 2019 · 17 min · Season 2 · Ep. 31

#059 A Look Into The Siemens Mindsphere IoT Platform?

The Internet of Things is a huge deal. There are many platforms available. But which one is actually good? Join me on a 50-minute dive into the Siemens Mindsphere online documentation. I have to say I was super unimpressed by what I found. Many limitations, unclear architecture and no pricing available? Not good!

May 27, 2019 · 54 min · Season 2 · Ep. 29

#057 Introducing The Plumbers Medium Publication

I have created a Medium Publication especially for us Plumbers of Data Science who work in Data Engineering and Big Data. It's called, you guessed it, Plumbers of Data Science.

May 27, 2019 · 13 min · Season 2 · Ep. 27

#055 Data Warehouse vs Data Lake

On this podcast I talk about data warehouses and data lakes. When do people use which? What are the pros and cons of both? I go over architecture examples for both and ask whether it makes sense to move completely to a data lake.

May 27, 2019 · 35 min · Season 2 · Ep. 25

#049 I Found A REAL Use For Blockchain, At Least I thought So

After all the BS solutions using Blockchain I thought I finally found one that makes sense. Of all the possibilities, it's the EU data protection law GDPR. Well, one problem I overlooked in this podcast is that it is impossible to delete data once it is in the chain. Being able to delete data is, however, a GDPR requirement. So, I was wrong. Again :D

May 27, 2019 · 10 min · Season 2 · Ep. 19

#047 The Truth About Data Science Salary For Graduates

In this episode I show you how much data science graduates are actually paid in Germany. All over the internet you can read that a Data Science salary is over 100k dollars. Whether Data Engineer or Data Scientist, it's way lower than that. Then I give you a few really good tips on how to choose the right company to work for. Huge corporation, startup or small company? Here's how to choose.

May 27, 2019 · 17 min · Season 2 · Ep. 17

#045 Why I Use LaTeX to Write Professionally And You Should Too

What is the best editing tool to write a thesis, a dissertation or a paper? NOT Word or Pages! It's LaTeX. In today's video I show you why I decided to use LaTeX to write my data engineering cookbook. I used it before for my diploma thesis and I am in love again :) Here's the link to the cheatsheet: https://wch.github.io/latexsheet/latexsheet.pdf Check out my Patreon for the Data Engineering Cookbook: http://bit.ly/PatreonAndreasKretz Music: "Day One" by Declan DP https://soundcloud.com/declandp...

Dec 07, 2018 · 13 min · Season 2 · Ep. 15

#044 How to Increase Your Chances for Internships or a Full-time Job

You have certifications or a university degree, but can't find a job? Sharing your ideas and knowledge will increase your chances! Here's how you can do that. Music: "Day One" by Declan DP https://soundcloud.com/declandp Attribution 3.0 Unported https://creativecommons.org/licenses/by/3.0/

Nov 27, 2018 · 12 min · Season 2 · Ep. 14

#041 Agile Development Is Important But Please Don't Do Scrum

I love agile development. People keep telling you to do Scrum, like it's the only and best way to be agile. It's not. Here's my take on Scrum and my four main beefs with it. Watch out for these issues if you are doing Scrum.

Oct 18, 2018 · 19 min · Season 2 · Ep. 13

#040 Huge Big Data News! Cloudera and Hortonworks Merge

So, Cloudera and Hortonworks merge... In today's Plumbers of Data Science Podcast I talk about what these big data vendors do. How they enable companies, admins and developers to do data science and many more things. If you are interested in the whole Hadoop ecosystem you need to check out this episode. You won't regret it ;)

Oct 09, 2018 · 24 min · Season 2 · Ep. 12

#039 Is ETL Dead For Data Science and Big Data?

Is ETL dead in Data Science and Big Data? In today's podcast I share my views on your questions regarding ETL (extract, transform, load). Data Lakes and Data Warehouses: where is the difference? Is ETL still practiced, or did preprocessing and cleansing replace it? What would replace ETL in Data Engineering? How to become a data engineer? (Check out my Facebook note.) How to get experience training at home? Real-time analytics with RDBMS or HDFS?

Oct 03, 2018 · 29 min · Season 2 · Ep. 11

#38 Morning advice to beginner Data Scientists and Data Engineers

What's the difference between Data Scientists and Data Analysts? What can you do to find internships or a full-time job? Data Scientists and Engineers in large and small companies: where's the difference? Are Data Engineers generalists or specialists? These are just some of the questions I go over in this podcast. You sent me over 100 questions, so I finally worked up the guts to start with the Q&A videos. Answering your questions one by one. Turns out it's a lot of fun :)

Sep 27, 2018 · 37 min · Season 2 · Ep. 10

#037 How To Boost Teamwork With Version Control

Without the proper tools and techniques of version control the team's efficiency goes down the drain. In this episode I talk about how tools like Jira enable you to collect bugs, future features or change requests. How they enable you to create and organize versions, add items to a version and assign items to developers. Once this is done, the team can efficiently start coding with the help of source code management systems like GitHub. How does all that work? Check out this episode to find out ...

Sep 12, 2018 · 17 min · Season 2 · Ep. 9

#036 Why Distributed Processing Is Super Important

You need to become comfortable with distributed processing. Whether it's Data Science or the Internet of Things, the amount of data that is getting produced and processed grows like crazy. In this podcast I talk about what a platform for distributed processing looks like. I talk about the different layers that need parallelization, as well as the tools you can use for on-premise installations or clouds like AWS, Azure or Google Cloud. Big Data tools like Kafka, Spark or serverless like Kinesis or Lambda func...

Sep 10, 2018 · 23 min · Season 2 · Ep. 8

#035 Learning By Doing Is The Best Thing Ever!

For me, school and university were hard. The lectures, sitting down and getting told how things work. Reading books and learning dry stuff was a drag. I was never good at writing tests. Some people excel at this. I was often envious. Over the years I found out what my problem is. I learn differently. I am a learning-by-doing guy. What does that mean and how am I dealing with it? Check out this episode. Maybe you have the same problem.

Sep 06, 2018 · 12 min · Season 2 · Ep. 7

#034 Talent Stacks For Data Engineers

Becoming an expert in a single skill is not the way to go for a data engineer. In this episode I talk about which talents go well together, both technical and personal ones, so that you build up a stack of knowledge that will make you a great data engineer.

Sep 04, 2018 · 23 min · Season 2 · Ep. 6

#033 How APIs Rule The World

Strong APIs make a good platform. In this episode I talk about why you need APIs and why Twitter is a great example. JSON APIs are my personal favorite, because JSON is also important in the Big Data world, for instance in log analytics. How? Check out this episode!

Sep 03, 2018 · 36 min · Season 2 · Ep. 5