Machine Learning Archives - Software Engineering Daily

softwareengineeringdaily.com
Machine learning and data science episodes of Software Engineering Daily.

Episodes

Go Data Science with Daniel Whitenack

Data science is typically done by engineers writing code in Python, R, or another scripting language. Lots of engineers know these languages, and their ecosystems have great library support. But these languages have some issues around deployment, reproducibility, and other areas. The programming language Golang presents an appealing alternative for data scientists. Daniel Whitenack transitioned from doing most of his data science work in Python to writing code in Golang. In this episode, Daniel ...

Feb 09, 2017 · 56 min

Translation with Vasco Pedro

Translation is a classic problem in computer science. How do you translate a sentence from one human language into another? This seems like a problem that computers are well suited to solve. Languages follow well-defined rules, and we have lots of sample data to train our machine learning models. And yet, the problem has not been solved, largely because languages don’t always follow rules. We have idioms and subtle contextual clues that make it hard to provide a computer with hard and fast rules for ...

Jan 25, 2017 · 51 min

Medical Machine Learning with Razik Yousfi and Leo Grady

Medical imaging is used to understand what is going on inside the human body and prescribe treatment. With new image processing and machine learning techniques, the traditional medical imaging techniques such as CT scans can be enriched to get a more sophisticated diagnosis. HeartFlow uses data from a standard CT scan to model a human heart and understand blockages of blood flow using simulations of fluid dynamics. In today’s episode, Razik Yousfi and Leo Grady from HeartFlow describe the data p...

Jan 17, 2017 · 53 min

Python Data Visualization with Jake VanderPlas

Data visualization tools are required to translate the findings of data scientists into charts, graphs, and pictures. Understanding how to utilize these tools and display data is necessary for a data scientist to communicate with people in other domains. In this episode, Srini Kadamati hosts a discussion with Jake VanderPlas about the Python ecosystem for data science and the different attempts at creating a data visualization library. Jake VanderPlas is the Director of Research for Physical Sci...

Jan 16, 2017 · 44 min

PANCAKE STACK Data Engineering with Chris Fregly

Data engineering is the software engineering that enables data scientists to work effectively. In today’s episode, we explore the different sides of data engineering: the data science algorithms that need to be processed and the implementation of software architectures that enable those algorithms to run smoothly. The PANCAKE STACK is a 12-letter acronym that Chris Fregly gave to a collection of data engineering technologies including Presto, Cassandra, Kafka, Elasticsearch, and Spark. In his cu...

Oct 17, 2016 · 55 min

Scikit-learn with Andreas Mueller

Scikit-learn is a set of machine learning tools in Python that provides easy-to-use interfaces for building predictive models. In a previous episode with Per Harald Borgen about Machine Learning For Sales, he illustrated how easy it is to get up and running and productive with scikit-learn, even if you are not a machine learning expert. Srini Kadamati hosts today’s show and interviews Andreas Mueller, a core committer to scikit-learn. Srini and Andreas discuss the background and implementation o...

Sep 27, 2016 · 31 min

Music Deep Learning with Feynman Liang

Machine learning can be used to generate music. In the case of Feynman Liang’s research project BachBot, the machine learning model is seeded with the music of famous composer Bach. The music that BachBot creates sounds remarkably similar to Bach, although it has been generated by an algorithm, not by a human. BachBot is a research project on computational creativity. Feynman Liang created BachBot using Python machine learning tools to build a long short-term memory (LSTM) model. Our conversation explo...

Sep 02, 2016 · 44 min

Automated Content with Robbie Allen

You have probably read a news article that was written by a machine. When earnings reports come out, or a series of sports events like the Olympics occurs, there are so many small stories that need to be written that a news organization like the Associated Press would have to use all of its resources to write enough content to cover it all. Wordsmith is a tool for automated content generation, and today’s guest Robbie Allen is the CEO of Automated Insights, the company that makes Wordsmith. He t...

Sep 01, 2016 · 48 min

Artificial Intelligence with Oren Etzioni

Research in artificial intelligence takes place mostly at universities and large corporations, but both of these types of institutions have constraints that cause the research to proceed a certain way. In a university, basic research might be hindered by lack of funding. At a big corporation, the researcher might be encouraged to study a domain that is not squarely in the interest of public good–such as targeted advertising. Oren Etzioni is the CEO of the Allen Institute for Artificial Intellige...

Aug 29, 2016 · 1 hr 2 min

TensorFlow in Practice with Rajat Monga

TensorFlow is Google’s open source machine learning library. Rajat Monga is the engineering director for TensorFlow. In this episode, we cover how to use TensorFlow, including an example of how to build a machine learning model to identify whether a picture contains a cat or not. TensorFlow was built with the mission of simplifying the process of deploying a machine learning model from research to production, so we also talk about that, as well as how TensorFlow can be used effectively in combin...

Aug 18, 2016 · 43 min

Data Validation with Dan Morris

Data validation is the process of ensuring that data is accurate. In many software domains, an application pulls in large quantities of data from external sources. That data will eventually be exposed to users, and it needs to be correct. Radius Intelligence is a company that aggregates data on small businesses. To keep business addresses and phone numbers correct, Radius uses human data validation to check its machine-gathered data. On today’s episode...

Aug 17, 2016 · 40 min

Machine Learning for Sales with Per Harald Borgen

Machine learning has become simplified. Similar to how Ruby on Rails made web development approachable, scikit-learn takes away many of the frustrating aspects of machine learning, and lets the developer focus on building functionality with high-level APIs. Per Harald Borgen is a developer at Xeneta. He started programming fairly recently, but has already built a machine learning application that cuts down on the time his sales team has to spend qualifying leads. What I found most interesting ab...

Aug 16, 2016 · 43 min

Phone Spam with Truecaller CTO Umut Alp

The war against spam has been going on for decades. Email spam blockers and ad blockers help protect us from unwanted messages in our communication and browsing experience. These spam prevention tools are powered by machine learning, which catches most of the emails and ads that we don’t want to see. Truecaller is a company that is bringing this quality of spam detection to our phone call systems. Umut Alp is the CTO of Truecaller, and he joins the show today to break down the engineering proble...

Jun 08, 2016 · 53 min

Machine Learning in Healthcare with David Kale

“Building a model to predict disease and deploying that in the wild – the bar for success is much higher there than, say, deciding what ad to show you.” Diagnosing illness today requires the trained eye of a doctor. With machine learning, we might someday be able to diagnose illness using only a data set. Today on Software Engineering Daily, we are joined by David Kale, a researcher at the intersection of machine learning and clinical data. We discuss the machine learning and research techniques...

Mar 08, 2016 · 57 min

Data Science at Monsanto with Tim Williamson

“Nothing’s cool unless you call it ‘as a service.’ ” Monsanto is a company that is known for its chemical and biological engineering. It is less well known for its data science and software engineering teams. Tim Williamson is a data scientist at Monsanto, and on today’s show he talked about how he and a small group of engineers at Monsanto dramatically shifted the culture around data science-driven genetic engineering. In this episode, Tim explains how useful graph databases are for modeling th...

Feb 29, 2016 · 55 min

Deep Learning and Keras with François Chollet

“I definitely think we can try to abstract away the first principles of intelligence and then try to go from these principles to an intelligent machine that might look nothing like the brain.” Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. In this episode, François discusses the state of deep learning, and explains why the field is experi...

Jan 29, 2016 · 52 min

Machine Learning for Businesses with Joshua Bloom

“You’ve got software engineers who are interested in machine learning, and think what they need to do is just bring in another module and then that will solve their problem. It’s particularly important for those people to understand that this is a different type of beast.” Machine learning is something that many businesses are starting to tack onto their existing processes. Yet, to add machine learning capabilities after the fact is often a fool’s errand. Joshua argues that machine learning cannot...

Jan 19, 2016 · 56 min

TensorFlow with Greg Corrado

“You don’t mind if failures slow things down, but it’s very important that failures do not stop forward progress.” TensorFlow is an open source machine learning library intended to bring large-scale, distributed machine learning and deep learning to everyone. Google recently released the framework to the public as a second-generation API, having learned from the successes and failures of DistBelief. Greg Corrado is a senior research scientist and tech lead at Google, where he focuses on the rese...

Dec 15, 2015 · 40 min

Data Science at Spotify with Boxun Zhang

“I normally try to sit together or very close to a product team or engineering team. And by doing so, I get very close to the source of all kinds of challenging problems.” Spotify is a streaming music service that uses data science and machine learning to implement product features such as recommendation systems and music categorization, but also to answer internal questions. Boxun Zhang is a data scientist at Spotify where he focuses on understanding user behavior within the product. Questions ...

Dec 11, 2015 · 56 min

Learning Machines with Richard Golden

“When I was a graduate student, I was sitting in the office of my advisor in electrical engineering and he said, ‘Look out that window – you see a Volkswagen, I see a realization of a random variable.’ ” Richard Golden is the host of Learning Machines 101, a podcast that covers artificial intelligence and machine learning topics. Dr. Golden is also a full-time Professor of Cognitive Science and Electrical Engineering at UT Dallas. Questions: What is machine learning? What are the fundamental con...

Dec 08, 2015 · 56 min

Machine Learning and Technical Debt with D. Sculley

“Changing anything changes everything.” Technical debt, referring to the compounding cost of changes to software architecture, can be especially challenging in machine learning systems. D. Sculley is a software engineer at Google, focusing on machine learning, data mining, and information retrieval. He recently co-authored the paper Machine Learning: The High Interest Credit Card of Technical Debt. Questions: How do you define technical debt? Why does technical debt tend to compound like financi...

Nov 17, 2015 · 32 min

Bridging Data Science and Engineering with Greg Lamp

Current infrastructure makes it difficult for data scientists to share analytical models with the software engineers who need to integrate them. Yhat is an enterprise software company tackling the challenge of how data science gets done. Their products enable companies and users to easily deploy data science environments and translate analytical models into production code. Greg Lamp is the Co-founder and CTO of Yhat and previously worked as a product manager in financial services. Yhat was part...

Oct 05, 2015 · 47 min

Kaggle with Ben Hamner

Data science competitions are an effective way to crowdsource the best solutions for challenging datasets. Kaggle is a platform for data scientists to collaborate and compete on machine learning problems with the opportunity to win money from the competitions’ sponsors. Ben Hamner is the co-founder and CTO of Kaggle. Questions: What is Kaggle? How does the experience of an individual competitor compare to the experience of a data science team? What is Kaggle’s tech stack? Do companies collect too...

Oct 03, 2015 · 50 min

Teaching Data Science with Vik Paruchuri

There is a need for more data scientists to make sense of the vast amounts of data we produce and store. Dataquest is an in-browser platform for learning data science that is tackling this problem. Vik Paruchuri is the founder of Dataquest. He was previously a machine learning engineer at EdX and before that a U.S. diplomat. Questions: What is data science? How does data science compare to software engineering? How does someone new to data science go about starting off at Kaggle? In machine learn...

Sep 30, 2015 · 45 min