The concept of machine learning products is a new one for the business world. There is a lack of clarity around key elements: Product Roadmaps and Planning, the Machine Learning Lifecycle, Project and Product Management, Release Management, and Maintenance. In this talk, we covered a framework specific to Machine Learning products. We discussed the improvements businesses can expect to see from a repeatable process. We also covered the concept of monetization and integrating machine learning int...
Aug 20, 2020•57 min•Season 1Ep. 31
Machine learning has become an increasingly important means for organizations to extract value from their data. Many companies start off with successful proofs of value but face problems when scaling their capabilities afterward. By generalizing engineering problems and solving them centrally, scaling becomes much more feasible. Model serving platforms generalize the problem of turning a machine learning model into a value-generating application. Combining a serving platform with cultural shifts s...
Aug 10, 2020•1 hr 3 min•Season 1Ep. 30
David & Elle talk about how one of the staples of DevOps, the practice of continuous integration, can work for machine learning. Continuous integration is a tried-and-true method for speeding up development cycles and rapidly releasing software, an area where data science and ML could use some help. Making continuous integration work for ML has been challenging in the past, and we chat about new open-source tools and approaches in the Git ecosystem for leveling up development processes with ...
Aug 08, 2020•1 hr 2 min•Season 1Ep. 6
Airflow is a renowned tool for data engineering. It helps with orchestrating ETL workloads and is well regarded among machine learning engineers as well. So, how does Airflow work and how is it applied to MLOps? In this episode, Demetrios and David are joined by Simon Darr, a Managing Consultant at Servian with many years of experience using Airflow, along with Byron Allen, a Senior Consultant at Servian specializing in ML. The group discusses how Airflow works and its pros and cons for ML...
Aug 04, 2020•54 min•Season 1Ep. 5
Most MLOps discussion traditionally focuses on model deployment, containerization, model serving - but where do the inputs come from and where do the outputs get used? In this session we demystify parts of the data science process used to create the all-important target variable and design machine learning experiments. We discuss some probability and statistical concepts which are useful for MLOps professionals. Knowledge of these concepts may assist practitioners working closely with data scien...
Jul 26, 2020•1 hr 1 min•Season 1Ep. 29
We asked what you wanted to hear next on our Coffee Sessions, and the vote was in favor of feature stores! Today the usual suspects, Demetrios Brinkmann and David Aponte, sat down with Jim Dowling, CEO of Logical Clocks, and Venkata Pingali, CEO of Scribble Data, to talk about feature stores: what they are, why we need them, some business implications, and everything in between! As always, if you enjoyed the session, let us know or reach out to us in Slack! Check out what...
Jul 25, 2020•1 hr 3 min•Season 1Ep. 4
As more and more machine learning models are deployed into production, it is imperative we have better observability tools to monitor, troubleshoot, and explain their decisions. In this talk, Aparna Dhinakaran, Co-Founder, CPO of Arize AI (Berkeley-based startup focused on ML Observability), will discuss the state of the commonly seen ML Production Workflow and its challenges. She will focus on the lack of model observability, its impacts, and how Arize AI can help. This talk highlights c...
Jul 24, 2020•55 min•Season 1Ep. 28
In this talk, I demonstrate an example of the ML project development and production workflows that we build on top of our proprietary core, Neu.ro, using a number of open-source and proprietary tools: Jupyter Notebooks, Tensorboard, FileBrowser, PyCharm Professional, Cookiecutter, Git, DVC, Airflow, Seldon, and Grafana. I describe how we integrate each of these tools with Neu.ro, and how we can improve these integrations. Mariya came to MLOps from a software development background. She started...
Jul 20, 2020•56 min•Season 1Ep. 27
It can be tricky to explain MLOps to colleagues and managers who are used to traditional software engineering and DevOps, let alone your gran. We have to answer the 'Isn't that just DevOps?' question clearly, otherwise the challenges of MLOps will continue to be underestimated (potentially by us as well as others). In this session we dive into what is new about MLOps and why current mainstream DevOps alone does not solve the problems. Ryan Dawson is an Engineer at Seldon and author of the articl...
Jul 16, 2020•1 hr 7 min•Ep. 3
Python's most popular data science libraries (pandas, NumPy, and scikit-learn) were designed to run on a single computer and, in some cases, on a single processor. Whether this computer is a laptop or a server with 96 cores, your compute and memory are constrained by the size of the biggest computer you have access to. In this course, you'll learn how to use Dask, a Python library for parallel and distributed computing, to bypass this constraint by scaling your compute and memory across multiple...
Jul 12, 2020•1 hr 29 min•Season 1Ep. 26
How To Monitor Machine Learning Stacks - Why Current Monitoring is Unable to Detect Serious Issues and What to Do About It, with Lina Weichbrodt. Monitoring usually focuses on the “four golden signals”: latency, errors, traffic, and saturation. Machine learning services can suffer from special types of problems that are hard to detect with these signals. The talk introduces these problems with practical examples and suggests additional metrics that can be used to detect them. A ...
Jul 11, 2020•56 min•Season 1Ep. 24
How to become a better data scientist: the definite guide, with Alexey Grigorev. We all know what we need to do to be good data scientists: know machine learning, be able to program, and be fluent in SQL and Python. That’s enough to do our job quite well. But what does it take to be a better data scientist? The best way to grow as a data scientist is to step out of direct responsibilities and try on the hats of a product manager as well as a DevOps engineer. In particular, we should: - be pragmatic an...
Jul 10, 2020•1 hr 1 min•Season 1Ep. 25
Companies are increasingly investing in Machine Learning (ML) to deliver new customer experiences and re-invent business processes. Unfortunately, the majority of operational ML projects never make it to production. The most significant blocker is the lack of infrastructure and tooling required to build production-ready data for ML. Kevin Stumpf has a long history of building data infrastructure for ML, first for Uber Michelangelo, and most recently as ...
Jul 04, 2020•1 hr 6 min•Season 1Ep. 23
David Aponte and Misha sat down and talked in depth about what the ML tool Paperspace can do. Misha Kutsovsky is a Senior Machine Learning Architect at Paperspace working on the Gradient team. He has expertise in machine learning, deep learning, distributed training, and MLOps. Previously he was on Microsoft's Windows Active Defense team, building fileless malware detection software and tooling machine learning systems for Microsoft DevOps & Data Scientist teams. He holds B.S. and M.S. degree...
Jun 28, 2020•1 hr 7 min•Season 1Ep. 22
Running a Fintech on Machine Learning. For this meetup we sat down with Caique Lima and Cristiano Breuel, Machine Learning Engineers at the Brazilian fintech Nubank. Nubank is a fintech providing credit and banking services to more than 20 million customers. Data science has been one of the company's pillars since the beginning, and many of its critical decisions in production are made with ML, in areas such as Credit, Fraud, and Customer Service. We discussed how they develop, ...
Jun 21, 2020•54 min•Season 1Ep. 18
DataOps and Data Version Control: MLOps.community meetup #19 with Dmitry Petrov, founder and creator of DVC.org. Data versioning and data management are core components of MLOps and any end-to-end AI platform. What challenges are related to data versioning, and how can they be overcome? What are the benefits of using Git and data codification as a foundation of data versioning? And how can open data versioning tools enable an open MLOps ecosystem instead of closed end-to-end ML platfo...
Jun 19, 2020•1 hr 2 min•Season 1Ep. 19
MLOps Coffee Sessions coming at you with our primer episode talking about KFServing! David Aponte and Demetrios Brinkmann dive deep into what model serving is in machine learning, what different types of serving there are, what serverless means, API endpoints, streaming and batch data, and a bit of coffee vs tea banter. ||Show Notes|| ML in Production is Hard Blog article by Nikki: http://veekaybee.github.io/2020/06/09/ml-in-prod/?utm_campaign=Data_Elixir&utm_sou...
Jun 13, 2020•50 min•Season 1Ep. 1
MLOps.community meetup #17: a deep dive into the open-source ML framework Hermione, built on top of MLflow, with Neylson Crepalde. Key takeaways for attendees: MLOps problems are dealt with using tools but also with processes; the open-source framework Hermione can help with many parts of the operations process. Abstract: In Neylson's experience with Machine Learning projects, he has encountered a series of challenges regarding agile processes to build and deploy ML models in a professional coop...
Jun 11, 2020•1 hr 1 min•Season 1Ep. 17
Venture Capital in Machine Learning Startups with John Spindler, CEO of Capital Enterprise. We talked about what trends he has been seeing within MLOps and ML companies, and also how he evaluates a deal. John Spindler has over 15 years' experience as an entrepreneur and business advisor/consultant. As well as being responsible for the day-to-day management of Capital Enterprise, he is also a general partner at AI Seed, an early-stage fund that inv...
Jun 06, 2020•57 min•Season 1Ep. 16
Human In The Loop Machine Learning and how to scale it with Robert Munro. This conversation centered around the components of Human-in-the-Loop Machine Learning systems and the challenges when scaling them. Most machine learning applications learn from human examples. For example, autonomous vehicles know what a pedestrian looks like because people have spent 1000s of hours labeling “pedestrians” in videos; your smart device understands you because people have spent 1000s of hours labeling...
Jun 04, 2020•55 min•Season 1Ep. 15
The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! MLflow vs Kubeflow is more like comparing apples to oranges or, as he likes to make the analogy, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionalities of both and the pros and cons they have to offer. Byron is a Senior Consultant at Servian - a data consultancy in Australia that als...
May 28, 2020•55 min•Season 1Ep. 14
Resume building and interviewing tips for data scientists and machine learning engineers. When on the job hunt, there are some tested tips and tricks that can be applied to your resume and interviews which will give you a leg up on the rest of the competition. Anthony Kelly, host of the AI in Action podcast and Executive Search Consultant focused on Machine Learning and Data Science, sat down with us to talk about what some of the best resumes and CVs have in common. We sp...
May 27, 2020•58 min•Season 1Ep. 13
MLOps meetup #12 // What are the advantages for a data scientist of knowing data engineering? What good is learning data engineering skills? These days full stack is overflowing with all the different things you need to know about, so why learn data engineering now? Our guest on this meetup will make the case for what the advantages are if you do decide to learn data engineering and also go into depth on how to do data engineering in the cloud. Dan Sullivan is a software architec...
May 21, 2020•1 hr
MLOps community meetup #11: Machine Learning at scale in Mercado Libre with Carlos de la Torre. Mercado Libre hosts the largest online commerce and payments ecosystem in Latin America. The IT department built Fury: a PaaS framework for the development and deployment of multi-cloud, multi-technology microservices. This platform leveraged the growth of the IT area, which now counts ~4000 people. As such, it lacked support for machine-learning-based solutions: an experimentation ...
May 16, 2020•59 min•Season 1Ep. 11
MLOps.community meetup #9 with Charles Martin - 10 years deploying Machine Learning in the Enterprise: The Inside Scoop! Why do some machine learning projects succeed while others fall down completely? In this session, we discuss the real-world challenges that enterprises face in deploying ML solutions, focusing on challenges with existing, legacy DevOps environments and how certain patterns of success emerge to help combat failure. Dr. Marti...
May 14, 2020•1 hr 3 min•Season 1Ep. 9
Meetup #10: Saurav Chakravorty sat down with us to talk about his vision of how MLOps reflects the old Indian story of the blind men and an elephant. As a lead data scientist at Brillo, Saurav has built many MLOps pipelines and has experience using different ML platforms. He comes to talk with us about the difficulties of taking an ML platform from infancy to production and other key factors he has seen within the MLOps space. Today data science is a field that is an aggregation of people fr...
May 08, 2020•55 min•Season 1Ep. 10
LinkedIn, Spotify, Volvo, JP Morgan, and many other market leaders are leveraging Kubeflow to simplify the creation and efficient deployment of Machine Learning models on Kubernetes. This presentation will provide an update on the Kubeflow 1.0 release and review the Community’s best practices to support Critical User Journeys, which optimize ML workflows. As data scientists often need to build (and save) hundreds of variants of their model, this session will provide a deeper div...
May 01, 2020•1 hr 4 min•Season 1Ep. 8
What does the MLOps pipeline at London-based fintech startup TrueLayer look like? TrueLayer decided to use Machine Learning instead of a rule-based system in mid-2019, and in our 7th meetup we spoke to their lead data scientist Alex Spanos about everything that entailed. During the meetup, we dove into how TrueLayer architected their MLOps pipeline for their Open Banking API: more specifically which tools they use and why, what prompted them to use...
Apr 24, 2020•57 min•Season 1Ep. 7
In our 6th meetup, we spoke with the CEO of Scribble Data, Dr. Venkata Pingali. Scribble helps build and operate production feature engineering platforms for sub-Fortune 1000 firms. The output of the platforms is consumed by data science and analytical teams. In this talk we discuss how we understand the problem space, the architecture of the platform that we built for preparing trusted model-ready datasets that are reproducible, auditable, and quality checked, and the lessons learned in the ...
Apr 16, 2020•59 min•Season 1Ep. 6
In our 5th meetup, we spoke with the Brazilian ML Engineer Flavio Clesio. Machine Learning Systems play a huge role in several businesses, from the banking industry to recommender systems in entertainment applications to health domains. The era of "A Data Scientist with a Script on a Single Machine" is officially over in high-stakes ML. We're entering an era of Machine Learning Operations (MLOps) where those critical applications that impact society and businesses need to be aware of aspects...
Apr 15, 2020•55 min•Season 1Ep. 5