AI Engineering Podcast

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

Episodes

The Role Of Model Development In Machine Learning Systems

Summary The focus of machine learning projects has long been the model that is built in the process. As AI-powered applications grow in popularity and power, the model is just the beginning. In this episode Josh Tobin shares his experience from his time as a machine learning researcher up to his current work as a founder at Gantry, and the shift in focus from model development to machine learning systems. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine ...

May 29, 2023 | 47 min | Ep. 18

Real-Time Machine Learning Has Entered The Realm Of The Possible

Summary Machine learning models have predominantly been built and updated in a batch modality. While this is operationally simpler, it doesn't always provide the best experience or capabilities for end users of the model. Tecton has been investing in the infrastructure and workflows that enable building and updating ML models with real-time data to allow you to react to real-world events as they happen. In this episode CTO Kevin Stumpf explores the benefits of real-time machine learning and the...

Mar 09, 2023 | 35 min | Ep. 17

How Shopify Built A Machine Learning Platform That Encourages Experimentation

Summary Shopify uses machine learning to power multiple features in their platform. In order to reduce the amount of effort required to develop and deploy models they have invested in building an opinionated platform for their engineers. They have gone through multiple iterations of the platform and their most recent version is called Merlin. In this episode Isaac Vidas shares the use cases that they are optimizing for, how it integrates into the rest of their data platform, and how they have de...

Feb 02, 2023 | 1 hr 6 min | Ep. 16

Applying Machine Learning To The Problem Of Bad Data At Anomalo

Summary All data systems are subject to the "garbage in, garbage out" problem. For machine learning applications bad data can lead to unreliable models and unpredictable results. Anomalo is a product designed to alert on bad data by applying machine learning models to various storage and processing systems. In this episode Jeremy Stanley discusses the various challenges that are involved in building useful and reliable machine learning models with unreliable data and the interesting problems tha...

Jan 24, 2023 | 59 min | Ep. 15

Build More Reliable Machine Learning Systems With The Dagster Orchestration Engine

Summary Building a machine learning model one time can be done in an ad-hoc manner, but if you ever want to update it and serve it in production you need a way of repeating a complex sequence of operations. Dagster is an orchestration engine that understands the data that it is manipulating so that you can move beyond coarse task-based representations of your dependencies. In this episode Sandy Ryza explains how his background in machine learning has informed his work on the Dagster project and ...

Dec 02, 2022 | 46 min | Ep. 14

Solve The Cold Start Problem For Machine Learning By Letting Humans Teach The Computer With Aitomatic

Summary Machine learning is a data-hungry approach to problem solving. Unfortunately, there are a number of problems that would benefit from the automation provided by artificial intelligence capabilities that don’t come with troves of data to build from. Christopher Nguyen and his team at Aitomatic are working to address the "cold start" problem for ML by letting humans generate models by sharing their expertise through natural language. In this episode he explains how that works, the various w...

Sep 28, 2022 | 52 min | Ep. 13

Convert Your Unstructured Data To Embedding Vectors For More Efficient Machine Learning With Towhee

Summary Data is one of the core ingredients for machine learning, but the format in which it is understandable to humans is not a useful representation for models. Embedding vectors are a way to structure data in a way that is native to how models interpret and manipulate information. In this episode Frank Liu shares how the Towhee library simplifies the work of translating your unstructured data assets (e.g. images, audio, video, etc.) into embeddings that you can use efficiently for machine le...

Sep 21, 2022 | 52 min | Ep. 12

Shedding Light On Silent Model Failures With NannyML

Summary Because machine learning models are constantly interacting with inputs from the real world they are subject to a wide variety of failures. The most commonly discussed error condition is concept drift, but there are numerous other ways that things can go wrong. In this episode Wojtek Kuberski explains how NannyML is designed to compare the predicted performance of your model against its actual behavior to identify silent failures and provide context to allow you to determine whether and h...

Sep 14, 2022 | 1 hr 3 min | Ep. 11

How To Design And Build Machine Learning Systems For Reasonable Scale

Summary Using machine learning in production requires a sophisticated set of cooperating technologies. A majority of resources that are available for understanding how to design and operate these platforms are focused on either simple examples that don’t scale, or over-engineered technologies designed for the massive scale of big tech companies. In this episode Jacopo Tagliabue shares his vision for "ML at reasonable scale" and how you can adopt these patterns for building your own platforms. An...

Sep 10, 2022 | 54 min | Ep. 10

Building A Business Powered By Machine Learning At Assembly AI

Summary The increasing sophistication of machine learning has enabled dramatic transformations of businesses and introduced new product categories. At Assembly AI they are offering advanced speech recognition and natural language models as an API service. In this episode founder Dylan Fox discusses the unique challenges of building a business with machine learning as the core product. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to b...

Sep 09, 2022 | 59 min | Ep. 9

Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River

Summary The majority of machine learning projects that you read about or work on are built around batch processes. The model is trained, and then validated, and then deployed, with each step being a discrete and isolated task. Unfortunately, the real world is rarely static, leading to concept drift and model failures. River is a framework for building streaming machine learning projects that can constantly adapt to new information. In this episode Max Halford explains how the project works, why ...

Aug 26, 2022 | 1 hr 15 min | Ep. 8

Using AI To Transform Your Business Without The Headache Using Graft

Summary Machine learning is a transformative tool for the organizations that can take advantage of it. While the frameworks and platforms for building machine learning applications are becoming more powerful and broadly available, there is still a significant investment of time, money, and talent required to take full advantage of it. In order to reduce that barrier further Adam Oliner and Brian Calvert, along with their other co-founders, started Graft. In this episode Adam and Brian explain ho...

Aug 16, 2022 | 1 hr 8 min | Ep. 7

Accelerate Development And Delivery Of Your Machine Learning Projects With A Comprehensive Feature Platform

Summary In order for a machine learning model to build connections and context across the data that is fed into it, the raw data needs to be engineered into semantic features. This is a process that can be tedious and full of toil, requiring constant upkeep and often leading to rework across projects and teams. In order to reduce the amount of wasted effort and speed up experimentation and training iterations, a new generation of services is being developed. Tecton first built a feature store to ...

Aug 06, 2022 | 51 min | Ep. 6

Build Better Models Through Data Centric Machine Learning Development With Snorkel AI

Summary Machine learning is a data-hungry activity, and the quality of the resulting model is highly dependent on the quality of the inputs that it receives. Generating sufficient quantities of high-quality labeled data is an expensive and time-consuming process. In order to reduce that time and cost Alex Ratner and his team at Snorkel AI have built a system for powering data-centric machine learning development. In this episode he explains how the Snorkel platform allows domain experts to creat...

Jul 29, 2022 | 54 min | Ep. 5

Declarative Machine Learning For High Performance Deep Learning Models With Predibase

Summary Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of...

Jul 21, 2022 | 1 hr | Ep. 4

Stop Feeding Garbage Data To Your ML Models, Clean It Up With Galileo

Summary Machine learning is a force multiplier that can generate an outsized impact on your organization. Unfortunately, if you are feeding your ML model garbage data, then you will get orders of magnitude more garbage out of it. The team behind Galileo experienced that pain for themselves and have set out to make data management and cleaning for machine learning a first-class concern in your workflow. In this episode Vikram Chatterji shares the story of how Galileo got started and how you can u...

Jul 14, 2022 | 47 min | Ep. 3

Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks

Summary Machine learning has the potential to transform industries and revolutionize business capabilities, but only if the models are reliable and robust. Because of the fundamental probabilistic nature of machine learning techniques it can be challenging to test and validate the generated models. The team at Deepchecks understands the widespread need to easily and repeatably check and verify the outputs of machine learning models and the complexity involved in making it a reality. In this epis...

Jul 06, 2022 | 49 min | Ep. 2

Build A Full Stack ML Powered App In An Afternoon With Baseten

Summary Building an ML model is getting easier than ever, but it is still a challenge to get that model in front of the people that you built it for. Baseten is a platform that helps you quickly generate a full stack application powered by your model. You can easily create a web interface and APIs powered by the model you created, or a pre-trained model from their library. In this episode Tuhin Srivastava, co-founder of Baseten, explains how the platform empowers data scientists and ML engineers ...

Jun 29, 2022 | 46 min | Ep. 1

Introducing The Show

Hello, and welcome to the Machine Learning Podcast. I’m your host, Tobias Macey. You might know me from the Data Engineering Podcast or the Python Podcast.__init__. If you work with machine learning and AI, or you’re curious about it and want to learn more, then this show is for you. We’ll go beyond the esoteric research and flashy headlines and find out how machine learning is making an impact on the world and creating value for business. Along the way we’ll be joined by the researchers, engin...

Jun 03, 2022 | 1 min