Machine Learning Archives - Software Engineering Daily

softwareengineeringdaily.com
Machine learning and data science episodes of Software Engineering Daily.

Episodes

Hyperparameter Tuning with Richard Liaw

Hyperparameters define the strategy for exploring the space in which a machine learning model is developed. Whereas the parameters of a machine learning model are the values learned from data during training, the hyperparameters are set beforehand and control how the training process builds the model that an end consumer will use. A different set of hyperparameters will yield a different model. Thus, it is important to try different hyperparameter configurations to see which models end up...

Aug 28, 2020 - 54 min

Machine Learning Labeling and Tooling with Lukas Biewald

CrowdFlower was a company started in 2007 by Lukas Biewald, an entrepreneur and computer scientist. CrowdFlower solved some of the data labeling problems that were not being solved by Amazon Mechanical Turk. A decade after starting CrowdFlower, the company was sold for several hundred million dollars. Today, data labeling has only grown in volume and scope. But Lukas has moved on to a different part of the machine learning stack: tooling for hyperparameter search and machine learning monitoring....

Aug 26, 2020 - 47 min

ParlAI: Facebook Dialogue Platform with Stephen Roller

Chatbots are useful for developing well-defined applications such as first-contact customer support, sales, and troubleshooting. But the potential for chatbots is so much greater. Over the last five years, there have been numerous platforms that have arisen to allow for better, more streamlined chatbot creation. Dialogue software enables the creation of sophisticated chatbots. ParlAI is a dialogue platform built inside of Facebook. It allows for the development of dialogue models within Facebook...

Aug 20, 2020 - 51 min

SuperAnnotate: Image Annotation Platform with Vahan and Tigran Petrosyan

Image annotation is necessary for building supervised learning models for computer vision. An image annotation platform streamlines the annotation of these images. Well-known annotation platforms include Scale AI, Amazon Mechanical Turk, and CrowdFlower. There are also large consulting-like companies that will annotate images in bulk for you. If you have an application that requires lots of annotation, such as self-driving cars, then you might be compelled to outsource this annotation to such a ...

Aug 19, 2020 - 55 min

Drug Simulations with Bryan Vicknair and Jason Walsh

Drug trials can lead to new therapeutics and preventative medications being discovered and placed on the market. Unfortunately, these drug trials typically require animal testing. This means animals are killed or harmed as a result of needing to verify that a drug will not kill humans. Animal testing is unavoidable, but the extent to which testing needs to occur can be reduced by inserting machine learning models which simulate the effects of a drug on the human body. If the simulated effect is ...

Jul 29, 2020 - 53 min

Metaflow: Netflix Machine Learning Platform with Savin Goyal

Netflix runs all of its infrastructure on Amazon Web Services. This includes business logic, data infrastructure, and machine learning. By tightly coupling itself to AWS, Netflix has been able to move faster and have strong defaults about engineering decisions. And today, AWS has such an expanse of services that it can be used as a platform to build custom tools. Metaflow is an open source machine learning platform built on top of AWS that allows engineers at Netflix to build directed acyclic gr...

Jul 13, 2020 - 53 min

Determined AI: Machine Learning Ops with Neil Conway

Developing machine learning models is not easy. From the perspective of the machine learning researcher, there is the iterative process of tuning hyperparameters and selecting relevant features. From the perspective of the operations engineer, there is a handoff from development to production, and the management of GPU clusters to parallelize model training. In the last five years, machine learning has become easier to use thanks to point solutions. TensorFlow, cloud provider tools, Spark, Jupyt...

Jul 08, 2020 - 42 min

Deepgram: End-to-End Speech Recognition with Scott Stephenson

Deepgram is an end-to-end deep learning platform for speech recognition. Unlike the general purpose APIs from Google or Amazon, Deepgram models are custom-trained for each customer. Whether the customer is a call center, a podcasting company, or a sales department, Deepgram can work with them to build something specific to their use case. Sound data is incredibly rich. Consider all the features in a voice recording: volume, intonation, inflection. And once the speech is transcribed, there are ma...

Jul 03, 2020 - 45 min

Cresta: Speech ML for Calls with Zayd Enam

At a customer service center, thousands of hours of audio are generated. This audio provides a wealth of information to transcribe and analyze. With the additional data of the most successful customer service representatives, machine learning models can be trained to identify which speech patterns are associated with a successful worker. By identifying these speaking patterns, a customer service center can continuously improve, with the different representatives learning the different patterns. ...

Jun 29, 2020 - 54 min

Traces: Video Recognition with Veronica Yurchuk and Kostyantyn Shysh (Summer Break Repeat)

Originally published October 8, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Video surveillance impacts human lives every day. On most days, we do not feel the impact of video surveillance. But the effects of video surveillance have tremendous potential. It can be used to solve crimes and find missing children. It can be used to intimidate journalists and empower dictators. Like any piece of technology, video surveillance can be used for good or evil. Video recognit...

Jun 25, 2020 - 1 hr 2 min

Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire (Summer Break Repeat)

Originally published June 13, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Machine learning allows software to improve as that software consumes more data. Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make the tools more accessible to the developers across the organization. There are many steps that an engineer must go through to use machine learning, an...

Jun 16, 2020 - 1 hr 5 min

Architects of Intelligence with Martin Ford (Summer Break Repeat)

Originally published January 31, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Artificial intelligence is reshaping every aspect of our lives, from transportation to agriculture to dating. Someday, we may even create a superintelligence–a computer system that is demonstrably smarter than humans. But there is widespread disagreement on how soon we could build a superintelligence. There is not even a broad consensus on how we can define the term “intelligence”. Informa...

Jun 15, 2020 - 59 min

Cruise Simulation with Tom Boyd

Cruise is an autonomous car company with a development cycle that is highly dependent on testing its cars–both in the wild and in simulation. The testing cycle typically requires cars to drive around gathering data, and that data to subsequently be integrated into a simulated system called Matrix. With COVID-19, the ability to run tests in the wild has been severely dampened. Cruise cannot put as many cars on the road, and thus has had to shift much of its testing procedures to rely more heavily...

Jun 12, 2020 - 53 min

Tecton: Machine Learning Platform from Uber with Kevin Stumpf

Machine learning workflows have had a problem for a long time: taking a model from the prototyping step and putting it into production is not an easy task. A data scientist who is developing a model is often working with different tools, or a smaller data set, or different hardware than the environment which that model will be deployed to. This problem existed at Uber just as it does at many other companies. Models were difficult to release, iterations were complicated, and collaboration between...

Jun 03, 2020 - 53 min

Edge Machine Learning with Zach Shelby

Devices on the edge are becoming more useful with improvements in the machine learning ecosystem. TensorFlow Lite allows machine learning models to run on microcontrollers and other devices with only kilobytes of memory. Microcontrollers are very low-cost, tiny computational devices. They are cheap, and they are everywhere. The low-energy embedded systems community and the machine learning community have come together with a collaborative effort called tinyML. tinyML represents the improvements ...

May 26, 2020 - 57 min

Rasa: Conversational AI with Tom Bocklisch

Chatbots became widely popular around 2016 with the growth of chat platforms like Slack and voice interfaces such as Amazon Alexa. As chatbots came into use, so did the infrastructure that enabled chatbots. NLP APIs and complete chatbot frameworks came out to make it easier for people to build chatbots. The first suite of chatbot frameworks was largely built around rule-based state machine systems. These systems work well for a narrow set of use cases, but fall over when it comes to chatbot mod...

Apr 24, 2020 - 52 min

Snorkel: Training Dataset Management with Braden Hancock

Machine learning models require the use of training data, and that data needs to be labeled. Today, we have high quality data infrastructure tools such as TensorFlow, but we don’t have large high quality data sets. For many applications, the state of the art is to manually label training examples and feed them into the training process. Snorkel is a system for scaling the creation of labeled training data. In Snorkel, human subject matter experts create labeling functions, and these functions ar...

Apr 09, 2020 - 53 min

Descript with Andrew Mason

Descript is a software product for editing podcasts and video. Descript is a deceptively powerful tool, and its software architecture includes novel usage of transcription APIs, text-to-speech, speech-to-text, and other domain-specific machine learning applications. Some of the most popular podcasts and YouTube channels use Descript as their editing tool because it provides a set of features that are not found in other editing tools such as Adobe Premiere or a digital audio workstation. Descript...

Mar 13, 2020 - 44 min

Anyscale with Ion Stoica

Machine learning applications are widely deployed across the software industry. Most of these applications use supervised learning, a process in which labeled data sets are used to find correlations between the labels and the trends in that underlying data. But supervised learning is only one application of machine learning. Another broad set of machine learning methods is described by the term “reinforcement learning.” Reinforcement learning involves an agent interacting with its environment. ...

Feb 13, 2020 - 49 min

Practical AI with Chris Benson

Machine learning algorithms have existed for decades. But in the last ten years, several advancements in software and hardware have caused dramatic growth in the viability of applications based on machine learning. Smartphones generate large quantities of data about how humans move through the world. Software-as-a-service companies generate data about how these humans interact with businesses. Cheap cloud infrastructure allows for the storage of these high volumes of data. Machine learning frame...

Dec 09, 2019 - 45 min

Future of Computing with John Hennessy (Holiday Repeat)

Originally published June 7, 2018. Moore’s Law states that the number of transistors in a dense integrated circuit doubles about every two years. Moore’s Law is less like a “law” and more like an observation or a prediction. Moore’s Law is ending. We can no longer fit an increasing number of transistors in the same amount of space at a highly predictable rate. Dennard scaling is also coming to an end. Dennard scaling is the observation that as transistors get smaller, the power density stays co...

Nov 26, 2019 - 57 min

Incident Response Machine Learning with Chris Riley

Software bugs cause unexpected problems at every company. Some problems are small. A website goes down in the middle of the night, and the outage triggers a phone call to an engineer who has to wake up and fix the problem. Other problems can be significantly larger. When a major problem occurs, it can cause millions of dollars in losses and requires hours of work to fix. When software unexpectedly breaks, it is called an incident. To triage these incidents, an engineer uses a combination of tool...

Nov 12, 2019 - 46 min

Traces: Video Recognition with Veronica Yurchuk and Kostyantyn Shysh

Video surveillance impacts human lives every day. On most days, we do not feel the impact of video surveillance. But the effects of video surveillance have tremendous potential. It can be used to solve crimes and find missing children. It can be used to intimidate journalists and empower dictators. Like any piece of technology, video surveillance can be used for good or evil. Video recognition lets us make better use of video feeds. A stream of raw video doesn’t provide much utility if we can’t ...

Oct 08, 2019 - 1 hr 1 min

Cruise: Self-Driving Engineering with Mo Elshenawy

The development of self-driving cars is one of the biggest technological changes that is under way. Across the world, thousands of engineers are working on developing self-driving cars. Although it still seems far away, self-driving cars are starting to feel like an inevitability. This is especially true if you spend much time in downtown San Francisco, where you will see a self-driving car being tested every day. Much of the time, that self-driving car will be operated by Cruise. Cruise is a co...

Oct 01, 2019 - 49 min

People.ai: Machine Learning for Sales with Andrey Akselrod

A large sales organization has hundreds of sales people. Each of those sales people manages a set of accounts on which they are trying to close sales deals. Sales people are overseen by managers who ensure that the sales people are performing well. Directors and VPs ensure the scalability and health of the overall sales organization. The sales lifecycle mostly takes place within a piece of software called a CRM: customer relationship management. This tool documents the interactions between sales p...

Aug 07, 2019 - 45 min

WebAssembly on IoT with Jonathan Beri

“Internet of Things” is a term used to describe the increasing connectivity and intelligence of physical objects within our lives. IoT has manifested within enterprises under the term “Industrial IoT,” as wireless connectivity and machine learning have started to improve devices such as centrifuges, conveyor belts, and factory robotics. In the consumer space, IoT has moved slower than many people expected, and it remains to be seen when we will have widespread computation within consumer devices...

Jul 30, 2019 - 50 min

Afresh: Grocery Store Software with Volodymyr Kuleshov

A grocery store contains fruit, vegetables, meat, bread, and other items that can expire. In order to keep these items in stock, the store must be aware of how much food has been sold and what has gone bad. When a food item is low in stock, the store needs to order more of that food from a central distribution system. Managing food inventory is not simple. Some kinds of meat might expire faster than others. Avocados do not become ripe at the same rate as apples. In order to keep the shelves stoc...

Jun 26, 2019 - 45 min

Niantic Real World with Paul Franceus

Niantic is the company behind Pokemon Go, an augmented reality game where users walk around in the real world and catch Pokemon which appear on their screen. The idea for augmented reality has existed for a long time. But the technology to bring augmented reality to the mass market has appeared only recently. Improved mobile technology makes it possible for a smartphone to display rendered 3-D images over a video stream without running out of battery. Ingress was the first game to come out of Ni...

Jun 21, 2019 - 52 min

Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire

Machine learning allows software to improve as that software consumes more data. Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make the tools more accessible to the developers across the organization. There are many steps that an engineer must go through to use machine learning, and each additional step reduces the chances that the engineer will actually get their model into producti...

Jun 13, 2019 - 1 hr 3 min

Augmented Reality Gaming with Tony Godar

Augmented reality applications can be used on smartphones and dedicated AR headsets. On smartphones, ARCore (Google) and ARKit (Apple) allow developers to build for the camera on a user’s smartphone. AR headsets such as Microsoft HoloLens and Magic Leap allow for a futuristic augmented reality headset experience. The most prominent use of augmented reality today is gaming, with a notable example being Niantic’s Pokemon Go. Tony Godar is a software engineer who works on augmented and virtual real...

May 28, 2019 - 45 min