MLOps.community

Demetrios mlops.community
Relaxed Conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, Vibes, etc)

Episodes

Scalable Python for Everyone, Everywhere // Matthew Rocklin // MLOps Meetup #38

Parallel Computing with Dask and Coiled. Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open source Python library that extends the existing Python data science stack (NumPy, pandas, scikit-learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadl...

Oct 19, 2020 · 57 min · Season 1 · Ep. 37

MLOps Coffee Sessions #13 How to Choose the Right Machine Learning Tool: A Conversation // Jose Navarro and Mariya Davydova

This time we talked about one of the most vibrant questions for any MLOps practitioner: how to choose the right tools for your ML team, given the huge amount of open-source and proprietary MLOps tools available on the market today. We discussed several criteria to rely on when choosing a tool, including: - The requirements of the particular team use-cases - The scaling capacity of the tool - The cost of migration from a chosen tool - The cost of teaching the team to use this tool - The company o...

Oct 18, 2020 · 1 hr 1 min · Season 1 · Ep. 13

MLOps Coffee Sessions #14 Conversation with the Creators of Dask // Hugo Bowne-Anderson and Matthew Rocklin

Dask: What is it? Parallelism for analytics. What is parallelism? Doing a lot at once by splitting tasks into smaller subtasks that can be processed in parallel (at the same time). Distributing work across multiple machines and then combining the results. Helpful for CPU-bound work: doing a bunch of calculations on the CPU, where the rate at which a process progresses is limited by the speed of the CPU. Concurrency? Similar, but things don’t have to happen at the same time; they can happen asynchronously. T...

Oct 12, 2020 · 57 min · Season 1 · Ep. 14
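
The split-then-combine idea in the Dask episode notes above can be sketched with Python's standard library alone. This is an illustrative sketch, not Dask's API or anything shown in the episode; the names `split_into_chunks`, `sum_of_squares`, `parallel_sum_of_squares`, and the `executor_cls` parameter are hypothetical and exist only for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    # Ceiling division, with a floor of 1 so an empty list yields no chunks.
    size = max(1, -(-len(data) // n_chunks))
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The CPU-bound subtask applied to each chunk independently."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4, executor_cls=ProcessPoolExecutor):
    """Split the work, process the chunks in a pool, then combine the results.

    executor_cls is a hypothetical knob for this sketch: separate processes
    suit CPU-bound work, while ThreadPoolExecutor can be swapped in for
    I/O-bound work.
    """
    chunks = split_into_chunks(data, n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)  # combine step


if __name__ == "__main__":
    numbers = list(range(1_000))
    print(parallel_sum_of_squares(numbers))
```

Each chunk is processed without knowledge of the others, which is what lets the same pattern stretch from several cores on one machine to several machines in a cluster.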

MLOps Coffee Sessions #12: Journey of Flyte at Lyft and Through Open-source // Ketan Umare

Why was Flyte built at Lyft? What sorts of requirements does a ML infrastructure team have at lyft? What problems does it solve / use cases? Where does it fit in in the ML and Data ecosystem? What is the vision? Who should consider using it? Learnings as the engineering team tried to bootstrap an open-source community. Ketan Umare is a senior staff software engineer at Lyft responsible for technical direction of the Machine Learning Platform and is a founder of the Flyte project. Before Flyte he...

Oct 10, 2020 · 1 hr 5 min · Season 1 · Ep. 12

MLOps Coffee Sessions #11: Analyzing “Continuous Delivery and Automation Pipelines in ML” // Part 3

Round 3 analyzing the Google paper "Continuous Delivery and Automation Pipelines in ML" // Show Notes Data Science Steps for ML Data extraction: You select and integrate the relevant data from various data sources for the ML task. Data analysis: You perform exploratory data analysis (EDA) to understand the available data for building the ML model. This process leads to the following: Understanding the data schema and characteristics that are expected by the model. Identifying the data preparatio...

Oct 04, 2020 · 1 hr 6 min · Season 1 · Ep. 11

MLOps Meetup #36: Moving Deep Learning from Research to Prod Using DeterminedAI and Kubeflow // David Hershey, DeterminedAI

MLOps community meetup #36! This week we talk to David Hershey, Solutions Engineer at Determined AI, about Moving Deep Learning from Research to Production with Determined and Kubeflow. // Key takeaways: What components are needed to do inference in ML How to structure models for ML inference How a model registry helps organize your models for easy consumption How you can set up reusable and easy-to-upgrade inference pipelines // Abstract: Translating the research that goes into creating a great ...

Oct 04, 2020 · 56 min · Season 1 · Ep. 36

MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning” // Part 2

Second installment of David and Demetrios reviewing the Google paper about continuous training and automated pipelines. They dive deep into machine learning monitoring and what continuous training actually entails. Some key highlights are: Automatically retraining and serving the models: When to do it? Outlier detection Drift detection Outlier detection: What is it? How you deal with it Drift detection Individual features may start to drift. This could be a bug or it could be perfectl...

Sep 22, 2020 · 1 hr 8 min · Season 1 · Ep. 10

MLOps Meetup #34: Streaming Machine Learning with Apache Kafka and Tiered Storage // Kai Waehner, Confluent

MLOps Meetup #34! This week we talk to Kai Waehner about the beast that is Apache Kafka and how many different ways you can use it! // Key takeaways: -Kafka is much more than just messaging -Kafka is the de facto standard for processing huge volumes of data at scale in real-time -Kafka and Machine Learning are complementary for various use cases (including data integration, data processing, model training, model scoring, and monitoring) // Abstract: The combination of Apache Kafka, tiered storag...

Sep 17, 2020 · 53 min · Season 1 · Ep. 35

MLOps Meetup #33 Owned By Statistics: How Kubeflow & MLOps Can Help Secure Your ML Workloads // David Aronchick - Head of Open Source ML Strategy at Azure

While machine learning is spreading like wildfire, very little attention has been paid to the ways that it can go wrong when moving from development to production. Even when models work perfectly, they can be attacked and/or degrade quickly if the data changes. Having a well understood MLOps process is necessary for ML security! Using Kubeflow, we demonstrated the common ways machine learning workflows go wrong, and how to mitigate them using MLOps pipelines to provide reproducibility, va...

Sep 14, 2020 · 56 min · Season 1 · Ep. 34

MLOps Coffee Sessions #9 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning” // Part 1

In this last episode, we covered how Google is thinking about MLOps and how automation plays a key part in their view of MLOps. We started to talk about CI, CD, and the role they play in a pipeline setup for CT. In the next episode, we'll pick up where we left off, starting our discussion of CT and some of the reasons you’d want to set up a pipeline with continuous training in the first place. Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5Ry...

Sep 14, 2020 · 59 min · Season 1 · Ep. 9

MLOps Meetup #32 Building Say Less: An AI-Powered Summarization App // Yoav Zimmerman - Founder of Model Zoo

Yoav is the builder behind Say Less, an AI-powered email summarization tool that was recently featured on the front page of Hacker News and Product Hunt. In this talk, Yoav will walk us through the end-to-end process of building the tool, from the prototype phase to deploying the model as a realtime HTTP endpoint. Yoav Zimmerman is the engineer / founder behind Model Zoo, a machine learning deployment platform focused on ease-of-use. He has previously worked at Determined AI on large-scale deep ...

Sep 08, 2020 · 53 min · Season 1 · Ep. 33

MLOps Coffee Sessions #8 // MLOps from the Perspective of an SRE // Neeran Gul

|| Links Referenced in the Show || General Info: https://medium.com/@paktek123 Load Balancer Series: https://medium.com/load-balancer-series Upcoming Open Src: https://medium.com/upcoming-open-source Some Libraries Neeran maintains: https://github.com/paktek123/elasticsearch-crystal Some libraries Neeran used to maintain: https://github.com/microsoft/pgtester (and https://medium.com/yammer-engineering/testing-postgresql-scripts-with-rspec-and-pg-tester-c3c6c1679aec) Some interesting projects Nee...

Sep 08, 2020 · 58 min · Season 1 · Ep. 8

MLOps Meetup #31 // Creating Beautiful Ambient Music with Google Brain’s Music Transformer // Daniel Jeffries - Chief Technology Evangelist at Pachyderm

We trained a Transformer neural net on ambient music to see if a machine can compose with the great masters. Ambient is a soft, flowing, ethereal genre of music that I’ve loved for decades. There are all kinds of ambient, from white noise, to tracks that mimic the murmur of soft summer rain in a sprawling forest, but Dan favors ambient that weaves together environmental sounds and dreamy, wavelike melodies into a single, lush tapestry. Can machine learning ever hope to craft something so seeming...

Sep 05, 2020 · 56 min · Season 1 · Ep. 32

MLOps Coffee Sessions #7 // MLOps and DevOps - Parallels and Deviations // Featuring Damian Brady

MLOps and DevOps have a large number of parallels. Many of the techniques, practices, and processes used for traditional software projects can be followed almost exactly in ML projects. However, the day-to-day of an ML project is usually significantly different from a traditional software project. So while the ideas and principles can still apply, it’s important to be aware of the core aims of DevOps when applying them. Damian is a Cloud Advocate specializing in DevOps and MLOps. After spending ...

Aug 31, 2020 · 56 min · Season 1 · Ep. 7

MLOps Meetup #30 // Path to Production and Monetizing Machine Learning // Vin Vashishta - Data Scientist | Strategist | Speaker & Author

The concept of machine learning products is a new one for the business world. There is a lack of clarity around key elements: Product Roadmaps and Planning, the Machine Learning Lifecycle, Project and Product Management, Release Management, and Maintenance. In this talk, we covered a framework specific to Machine Learning products. We discussed the improvements businesses can expect to see from a repeatable process. We also covered the concept of monetization and integrating machine learning int...

Aug 20, 2020 · 57 min · Season 1 · Ep. 31

MLOps Meetup #29 // Scaling Machine Learning Capabilities in Large Organizations // Bertjan Broeksema & Axel Goblet

Machine learning has become an increasingly important means for organizations to extract value from their data. Many companies start off with successful proofs of value but face problems when scaling their capabilities afterward. By generalizing engineering problems and solving them centrally, scaling becomes much more feasible. Model serving platforms generalize the problem of turning a machine learning model into a value-generating application. Combining a serving platform with cultural shifts s...

Aug 10, 2020 · 1 hr 3 min · Season 1 · Ep. 30

MLOps Coffee Sessions #6 // Continuous Integration for ML // Featuring Elle O'Brien

David & Elle talk about how one of the staples of DevOps, the practice of continuous integration, can work for machine learning. Continuous integration is a tried-and-true method for speeding up development cycles and rapidly releasing software- an area where data science and ML could use some help. Making continuous integration work for ML has been challenging in the past, and we chat about new open-source tools and approaches in the Git ecosystem for leveling up development processes with ...

Aug 08, 2020 · 1 hr 2 min · Season 1 · Ep. 6

MLOps Coffee Sessions #5 // Airflow in MLOps // Featuring Simon Darr and Byron Allen

Airflow is a renowned tool for data engineering. It helps with orchestrating ETL workloads and it's well regarded amongst machine learning engineers as well. So, how does Airflow work and how is it applied to MLOps? In this episode, Demetrios and David are joined by Simon Darr, a Managing Consultant at Servian, with many years of experience using Airflow, along with Byron Allen, a Senior Consultant at Servian, specializing in ML. The group discusses how Airflow works, its pros and cons for ML...

Aug 04, 2020 · 54 min · Season 1 · Ep. 5

MLOps #28 Continuous Evaluation & Model Experimentation // Danny Ma - Founder & CEO at Sydney Data Science

Most MLOps discussion traditionally focuses on model deployment, containerization, model serving - but where do the inputs come from and where do the outputs get used? In this session we demystify parts of the data science process used to create the all-important target variable and design machine learning experiments. We discuss some probability and statistical concepts which are useful for MLOps professionals. Knowledge of these concepts may assist practitioners working closely with data scien...

Jul 26, 2020 · 1 hr 1 min · Season 1 · Ep. 29

MLOps Coffee Sessions #4: A Conversation Around Feature Stores with Venkata Pingali and Jim Dowling

We asked what you wanted to hear next on our Coffee Sessions and the vote was in favor of feature stores! Today the usual suspects, Demetrios Brinkmann and David Aponte, sat down with Jim Dowling, CEO of Logical Clocks, and Venkata Pingali, CEO of Scribble Data, to talk about feature stores: what they are, why we need them, some business implications, and everything in between! As always, if you enjoyed the session let us know or reach out to us in Slack! Check out what Jim is doing around hopsw...

Jul 25, 2020 · 1 hr 3 min · Season 1 · Ep. 4

MLOps #27 ML Observability // Aparna Dhinakaran - Chief Product Officer at Arize AI

As more and more machine learning models are deployed into production, it is imperative we have better observability tools to monitor, troubleshoot, and explain their decisions. In this talk, Aparna Dhinakaran, Co-Founder, CPO of Arize AI (Berkeley-based startup focused on ML Observability), will discuss the state of the commonly seen ML Production Workflow and its challenges. She will focus on the lack of model observability, its impacts, and how Arize AI can help. This talk highlights common c...

Jul 24, 2020 · 55 min · Season 1 · Ep. 28

MLOps Meetup #26 // How to Leverage ML Tooling Ecosystem // Mariya Davydova - Head of Product at Neu.ro

In this talk, I demonstrate an example of an ML project development and production workflows which we build on top of our proprietary core - Neu.ro - using a number of open-source and proprietary tools: Jupyter Notebooks, Tensorboard, FileBrowser, PyCharm Professional, Cookiecutter, Git, DVC, Airflow, Seldon, and Grafana. I describe how we integrate each of these tools with Neu.ro, and how we can improve these integrations. Mariya came to MLOps from a software development background. She started...

Jul 20, 2020 · 56 min · Season 1 · Ep. 27

MLOps Coffee Sessions #3 MLOps: Isn't That Just DevOps? // Featuring Ryan Dawson

It can be tricky to explain MLOps to colleagues and managers who are used to traditional software engineering and DevOps, let alone your gran. We have to answer the 'Isn't that just DevOps?' question clearly, otherwise the challenges of MLOps will continue to be underestimated (potentially by us as well as others). In this session we dive into what is new about MLOps and why current mainstream DevOps alone does not solve the problems. Ryan Dawson is an Engineer at Seldon and author of the articl...

Jul 16, 2020 · 1 hr 7 min · Ep. 3

MLOps Meetup #25 // Python and Dask: Scaling the DataFrame // Dan Gerlanc - Founder of Enplus Advisors

Python's most popular data science libraries—pandas, numpy, and scikit-learn—were designed to run on a single computer, and in some cases, using a single processor. Whether this computer is a laptop or a server with 96 cores, your compute and memory are constrained by the size of the biggest computer you have access to. In this course, you'll learn how to use Dask, a Python library for parallel and distributed computing, to bypass this constraint by scaling our compute and memory across multiple...

Jul 12, 2020 · 1 hr 29 min · Season 1 · Ep. 26

MLOps Meetup #23 // Monitoring the ML stack // Lina Weichbrodt

How To Monitor Machine Learning Stacks - Why Current Monitoring is Unable to Detect Serious Issues and What to Do About It with Lina Weichbrodt. Monitoring usually focusses on the “four golden signals”: latency, errors, traffic, and saturation. Machine learning services can suffer from special types of problems that are hard to detect with these signals. The talk will introduce these problems with practical examples and suggest additional metrics that can be used to detect them. A case study de...

Jul 11, 2020 · 56 min · Season 1 · Ep. 24

MLOps Meetup #24 // How to Become a Better Data Scientist: The Definite Guide // Alexey Grigorev

How to become a better data scientist: the definite guide with Alexey Grigorev. We all know what we need to do to be good data scientists: know machine learning, be able to program, be fluent in SQL and Python. That’s enough to do our job quite well. But what does it take to be a better data scientist? The best way to grow as a data scientist is to step out of direct responsibilities and try on the hats of a product manager as well as a DevOps engineer. In particular, we should: - be pragmatic an...

Jul 10, 2020 · 1 hr 1 min · Season 1 · Ep. 25

MLOps #22 Feature Stores: An Essential Part of the ML Stack to Build Great Data // Kevin Stumpf - Co-Founder & CTO at Tecton

Companies are increasingly investing in Machine Learning (ML) to deliver new customer experiences and re-invent business processes. Unfortunately, the majority of operational ML projects never make it to production. The most significant blocker is the lack of infrastructure and tooling required to build production-ready data for ML. Kevin Stumpf has a long history of building data infrastructure for ML, first for Uber Michelangelo, and most recently as co-founder of Tecton. Kevin will share his ...

Jul 04, 2020 · 1 hr 6 min · Season 1 · Ep. 23

MLOps Meetup #21 Deep Dive on Paperspace Tooling // Misha Kutsovsky - Senior ML Architect at Paperspace

David Aponte and Misha sat down and talked in depth about what the ML tool Paperspace can do. Misha Kutsovsky is a Senior Machine Learning Architect at Paperspace working on the Gradient team. He has expertise in machine learning, deep learning, distributed training, and MLOps. Previously he was on Microsoft's Windows Active Defense team building fileless malware detection software and tooling machine learning systems for Microsoft DevOps & Data Scientist teams. He holds B.S. and M.S. degree...

Jun 28, 2020 · 1 hr 7 min · Season 1 · Ep. 22

MLOps Meetup #18 // Nubank - Running a Fintech on ML // Caique Lima and Cristiano Breuel

Running a Fintech on Machine Learning. For this meetup we sat down with Caique Lima and Cristiano Breuel, Machine Learning Engineers at the Brazilian Fintech Nubank. Nubank is a Fintech providing credit and banking services to more than 20 million customers. Data science has been one of the company's pillars since the beginning, and many of its critical decisions in production are made with ML, in areas such as Credit, Fraud and Customer Service. We discussed how they develop, deploy and monitor M...

Jun 21, 2020 · 54 min · Season 1 · Ep. 18

MLOps Meetup #19 // DataOps and Data Versioning in ML // Dmitry Petrov

DataOps and Data Version Control: MLOps.community meetup #19 with the founder and creator of DVC.org, Dmitry Petrov. Data versioning and data management are core components of MLOps and any end-to-end AI platform. What challenges are related to data versioning and how to overcome these? What are the benefits of using Git and data codification as a foundation of data versioning? And how can open data versioning tools enable an open MLOps ecosystem instead of closed end-to-end ML platforms? DVC and ...

Jun 19, 2020 · 1 hr 2 min · Season 1 · Ep. 19