Chris Riccomini - Building (and Writing About) Data Intensive Applications
Chris Riccomini and I chat about building his latest project SlateDB, building data intensive infrastructure, writing, investing, and much more.
In this episode, I have a chat with Antti Rask, Juha Korpella, Niko Korvenlaita, Russell Willis, and Kosti Hokkanen. We chat about data, startups, and business in Finland and Europe.
Let's do things the right way, not just the fast way. My works: 📕Fundamentals of Data Engineering: https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/ 🎥 Deeplearning.ai Data Engineering Certificate: https://www.coursera.org/professional-certificates/data-engineering 🔥Practical Data Modeling: https://practicaldatamodeling.substack.com/ 🤓 My SubStack: https://joereis.substack.com/
I speak at a lot of conferences, and I've lost track of how many questions I've answered. Since conferences are top of mind for me right now, here are some tips for asking good (and bad) questions of speakers.
Wes McKinney and I chat about Positron, Arrow, how he created Pandas and Arrow, and what makes him tick.
I've seen a TON of horror stories with tech debt and code migrations. It's estimated that 15% to 60% of every dollar in IT spend goes toward tech debt (that's a big range, I know). Regardless, most of this tech debt will not be paid down without a radical change in how we do things. Might AI be the Hail Mary we need to pay down tech debt? I don't see why not...
Anne-Claire Baschet and Yoann Benoit recently wrote a wonderful article called The Data Death Cycle, which describes the feedback loop of doom that many data teams find themselves in. Here, we discuss the Data Death Cycle in detail. Article: https://medium.com/craftingdataproducts/the-data-death-cycle-6b10ef261d8e
Larry Burns and I chat about all things data teams—how they fail, their challenges, and how they can add value. To add value, we need to reimagine not only how we think about data but also how we manage knowledge. Larry brings a fresh and battle-worn perspective to the data field, and if you work on or manage a data team, this conversation is worth a listen. LinkedIn: https://www.linkedin.com/in/larryburnsdba/
This week I posted about how some major conferences charge a bunch of money for tickets and sponsorship, but don't pay speakers. As a speaker, I find this unethical and exploitative. Here, I unpack my thoughts on speaking at conferences. If you're a speaker, or want to become one, this is worth your time to listen. My post: https://www.linkedin.com/posts/josephreis_this-morning-i-had-to-decline-a-speaking-activity-7252331326287011841-NPG6
Vijay Yadav (Director of Data Science at Merck) joins me to chat about a very interesting project he launched at Merck involving LLMs in production. A big part of this discussion is how to make data ready for generative AI. This is a great example of an LLM-native use case in production, and those are rare right now. Lots to learn from here. Enjoy! LinkedIn: https://www.linkedin.com/in/vijay-yadav-ds/
In my newsletter last week, I wrote "Data’s still a mess. Most data initiatives fail. Data teams are seen as a cost center and not getting the support they deserve. Same as it ever was." Here, I unpack those four sentences. Data teams need to stop playing to not lose. Instead, they need to play to win!
Navnit Shukla is a solutions architect with AWS. He joins me to chat about data wrangling and architecting solutions on AWS, writing books, and much more. Navnit is also in the Coursera Data Engineering Specialization, dropping knowledge on data engineering on AWS. Check it out! Data Wrangling on AWS: https://www.amazon.com/Data-Wrangling-AWS-organize-analysis/dp/1801810907 LinkedIn: https://www.linkedin.com/in/navnitshukla/...
I've spent the last three weeks visiting the UK, Australia, and New Zealand. Here are my observations and anecdotes about the data and ML/AI industry from countless chats with executives, practitioners, and pundits.
Ilya Reznik has been in the ML game for ages, having worked at Adobe and Twitter and led teams at Meta, among others. We chat about leading ML teams, AI today, creating content, and much more. LinkedIn: https://www.linkedin.com/in/ibreznik/
As I travel this Fall, I'm reminded that most people don't work at fancy tech companies. Most people work at traditional companies with "boring" data and tech stacks. And that's OK. Boring is good.
Jordan Morrow has written a ton, including four books. We chat about the process of writing books, the ins and outs of working with a publisher, the role of AI in writing, and much more. If you're interested in writing a book, this is a crash course in what you should know. Enjoy!
Venkat Subramaniam is a programmer, author, speaker, and founder of Agile Developer, Inc. I've seen him speak several times, and was always blown away by his passion and technical depth. So, I was excited to have him on the podcast. We chat about agile development in the real world, learning to do less, and much more. Venkat is extremely wise, and I very much enjoyed our discussion. Enjoy! LinkedIn: https://www.linkedin.com/in/vsubramaniam Twitter: https://x.com/venkat_s...
Uncle Rico is a character in the movie Napoleon Dynamite, who is stuck in the past, reminiscing about his days as a high school football star. If only he'd won the game and gone to the state championship. Some of the data industry reminds me of Uncle Rico. During a recent panel, there was a question about whether AI can help with data management (governance, modeling, etc). Some people were quick to dismiss this, saying that machines are no substitute for humans in their understanding and transl...
Paco Nathan is a national treasure. He's not only an OG in the field of AI, but he was also instrumental in early hacker and cyberpunk culture. When I first met Paco, it suddenly clicked that I'd seen his name in various cyberpunk and alternative zines back in the 1990s. We chat about all sorts of crazy stuff, and I feel like we only got to 5% of the stories.
Bethany Lyons and I chat about disrupting the recruitment industry, startups, and the future of work.
Last week I talked about how good you have to be at your job. Yesterday's OpenAI announcement of its "reasoning" model, o1, got me thinking about how good AI needs to be to do our jobs.
Ergest Xheblati is a data architect and author of Minimum Viable SQL Patterns. We chat about the opportunities and challenges of SQL, things that don't change in tech and data, writing and publishing books, and much more. LinkedIn: https://www.linkedin.com/in/ergestx/ Other links: https://www.ergestx.com/links/
Dylan Anderson is a UK-based data strategist. We chat about bridging the gap between data and strategy, why talking about business value is a waste of time, and much more. LinkedIn: https://www.linkedin.com/in/dylansjanderson/ Substack: https://thedataecosystem.substack.com/
While I was speaking with one of my best friends, who's worked as a pilot for over 30 years, he mentioned that a "good" pilot doesn't crash the plane. In tech and data, "good" is viewed differently. How good do you have to be at your job?
Jordan Tigani is back to chat about why small data is awesome, data lakehouses, DuckDB, AI, and much more. Motherduck: https://motherduck.com/ LinkedIn: https://www.linkedin.com/in/jordantigani/ Twitter: https://twitter.com/jrdntgn?lang=en
Demetrios Brinkmann is the co-founder of the massively global MLOps Community. We chat about AI hype vs reality, building a global tech community, the ROI of AI projects, and much more. LinkedIn: https://www.linkedin.com/in/dpbrinkm/ MLOps Community: https://mlops.community/
"Do you and Zach Wilson hate each other?" I get asked questions like this, and it makes me laugh. For the record, we're good friends. Most people play zero-sum games, where one person wins and another loses. Questions like this got me thinking about how content creation is a positive-sum game. You can consume content from many people, and this benefits everyone. Here, I unpack the differences between zero-sum and positive-sum games.
Vinoo Ganesh is an open source enthusiast and contributor, and a data and ML engineer. We chat about strong open source communities, LLMs and AI, and much more.
Lekhana Reddy is a data content creator focusing on mindfulness. We chat about how mindfulness in technology is key, especially given the need to maintain humanity with the rise of AI. LinkedIn: https://www.linkedin.com/in/lekhanareddy/ Instagram: https://www.instagram.com/storytellingbydata/
Until recently, Nik Suresh wrote under a mysterious blog that had several viral posts, including the famous "I Will F*cking Piledrive You If You Mention AI Again." For the longest time, he was an underground sensation, with nobody (not even his friends) knowing his identity. In this episode, we chat about his blog posts (I'm a huge fan), the realities of data science and data engineering, and much more. This is a very candid and fun chat where I'm actually the fanboy, so enjoy! Blog: https://lud...