Data Science at Home

Francesco Gadaleta · datascienceathome.podbean.com

Cutting through AI bullsh*t.
Come join the discussion on Discord!
https://discord.gg/4UNKGf3

Episodes

Distill data and train faster, better, cheaper (Ep. 128)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. Our Sponsors: Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provide solutions to collect and secure data with higher transparency and disintermediation, and build the statistical models that will supp...

Nov 17, 2020 · 24 min · Ep. 128

Machine Learning in Rust: Amadeus with Alec Mocatta [RB] (ep. 127)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. Our Sponsors: ProtonVPN offers a simple and trusted solution to protect your internet connection and access blocked or restricted websites. All of ProtonVPN’s apps are open source and have been inspected by cybersecurity experts, and Proton is based in Switzerland, home to some of the world's strongest privacy laws. Amethix use advanced Artificia...

Nov 11, 2020 · 24 min · Ep. 127

Top-3 ways to put machine learning models into production (Ep. 126)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. Our Sponsors: physicspodcast.com is not just a physics podcast. Interviews with scientists, scholars, and authors, and reflections on the history and future of science and technology, are all in the wheelhouse. Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finan...

Nov 07, 2020 · 20 min · Ep. 126

Remove noise from data with deep learning (Ep.125)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. Our Sponsors: ProtonMail is a secure and private email provider that protects your messages with end-to-end encryption and zero-access encryption so that, besides you, no one can access them. Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceu...

Nov 03, 2020 · 24 min · Ep. 125

What is contrastive learning and why it is so powerful? (Ep. 124)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. Our Sponsors: The Monday Apps Challenge is bringing developers around the world together to compete in order to build apps that can improve the way teams work together on monday.com. Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals,...

Oct 30, 2020 · 26 min · Ep. 124

Neural search (Ep. 123)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. This episode is supported by Monday.com. The Monday Apps Challenge is bringing developers around the world together to compete in order to build apps that can improve the way teams work together on monday.com ....

Oct 23, 2020 · 19 min · Ep. 123

Let's talk about federated learning (Ep. 122)

Let's talk about federated learning. Why is it important? Why are large organizations not ready yet? Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. This episode is supported by Monday.com. The Monday Apps Challenge is bringing developers around the world together to compete in order to build apps that can improve the way teams work together on monday.com ....

Oct 18, 2020 · 30 min · Ep. 122

How to test machine learning in production (Ep. 121)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. This episode is supported by Monday.com. Monday.com brings teams together so you can plan, manage and track everything your team is working on in one centralized place. The Monday Apps Challenge is bringing developers around the world together to compete in order to build apps that can improve the way teams work together on monday.com ....

Oct 11, 2020 · 29 min · Ep. 121

Why synthetic data cannot boost machine learning (Ep. 120)

Come join me in our Discord channel, where we talk about all things data science. Follow me on Twitch during my live coding sessions, usually in Rust and Python. This episode is supported by Women in Tech by Manning Conferences.

Sep 26, 2020 · 23 min · Ep. 120

Machine learning in production: best practices [LIVE from twitch.tv] (Ep. 119)

Hey there! Having the best time of my life ;) This is the first episode I record while I am live on my new Twitch channel :) So much fun! Feel free to follow me for the next live stream. You can also see me coding machine learning stuff in Rust :)) Don't forget to jump on the usual Discord and have a chat. I'll see you there!

Sep 16, 2020 · 38 min · Ep. 119

Testing in machine learning: checking deeplearning models (Ep. 118)

In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning. We cover testing with deep learning (neuron coverage, threshold coverage, sign change coverage, layer coverage, etc.), combinatorial testing and their practical aspects. On September 15th there will be a live@Manning Rust conference. In one Rust-full day you will attend many talks about what's special about Rust, building high performance web services or video game,...

Sep 04, 2020 · 18 min · Ep. 118

Testing in machine learning: generating tests and data (Ep. 117)

In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning. On September 15th there will be a live@Manning Rust conference. In one Rust-full day you will attend many talks about what's special about Rust, building high performance web services or video games, WebAssembly and much more. If you want to meet the tribe, tune in on September 15th to the live@Manning Rust conference....

Aug 29, 2020 · 20 min · Ep. 117

Why you care about homomorphic encryption (Ep. 116)

After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds, tweets: in one word, hype. This time it's all about privacy and data confidentiality. The new words: homomorphic encryption. Join and chat with us on the official Discord channel. Sponsors: This episode is supported by Amethix Technologies. Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they...

Aug 12, 2020 · 19 min · Ep. 116

Test-First machine learning (Ep. 115)

In this episode I speak about a testing methodology for machine learning models that are meant to be integrated into production environments. Don't forget to come chat with us in our Discord channel. Enjoy the show! This episode is supported by Amethix Technologies. Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machin...

Aug 03, 2020 · 20 min · Ep. 112

GPT-3 cannot code (and never will) (Ep. 114)

The hype around GPT-3 is alarming and gives us an awful picture of people misunderstanding artificial intelligence. In response to some comments that claim GPT-3 will take developers' jobs, in this episode I express some personal opinions about the state of AI in generating source code (and in particular GPT-3). If you have comments about this episode or just want to chat, come join us on the official Discord channel. This episode is supported by Amethix Technologies. Amethi...

Jul 26, 2020 · 19 min · Ep. 111

Make Stochastic Gradient Descent Fast Again (Ep. 113)

There is definitely room for improvement in the stochastic gradient descent family of algorithms. In this episode I explain a relatively simple method that has been shown to improve on the Adam optimizer. But, watch out! This approach does not generalize well. Join our Discord channel and chat with us. References: More descent, less gradient; Taylor Series...

Jul 22, 2020 · 21 min · Ep. 110

What data transformation library should I use? Pandas vs Dask vs Ray vs Modin vs Rapids (Ep. 112)

In this episode I speak about data transformation frameworks available for the data scientist who writes Python code. The usual suspect is clearly Pandas, as the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer performs as expected. Other frameworks play a role in such a context. In this episode I explain the frameworks that are the best equivalent to Pand...

Jul 19, 2020 · 21 min · Ep. 109

[RB] It’s cold outside. Let’s speak about AI winter (Ep. 111)

In this episode I speak with Filip Piekniewski about some of the most noteworthy findings in AI and machine learning in 2019. As a matter of fact, the entire field of AI has been inflated by hype and claims that are hard to believe. A lot of the promises made a few years ago have proved quite hard to achieve, if not impossible. Let's stay grounded and realistic about the potential of this amazing field of research, so as not to bring disillusionment in the near future. Join us on our Discord channel to d...

Jul 03, 2020 · 37 min · Ep. 108

Rust and machine learning #4: practical tools (Ep. 110)

In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly. To make a comparison with the Python ecosystem I will cover frameworks for linear algebra (numpy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (tensorflow) and reinforcement learning (OpenAI). Rust is the language of the future. Happ...

Jun 29, 2020 · 24 min · Ep. 107

Rust and machine learning #3 with Alec Mocatta (Ep. 109)

In the 3rd episode of Rust and machine learning I speak with Alec Mocatta. Alec is a professional programmer with 20+ years of experience who has been spending time at the intersection of distributed systems and data analytics. He's the founder of two startups in the distributed systems space and author of Amadeus, an open-source framework that encourages you to write clean and reusable code that works, regardless of data scale, locally or distributed across a cluster. Only for June 24th, LDN *Virtual* T...

Jun 22, 2020 · 24 min · Ep. 106

Rust and machine learning #2 with Luca Palmieri (Ep. 108)

In the second episode of Rust and machine learning I am speaking with Luca Palmieri, who has spent a large part of his career at the intersection of machine learning and data engineering. In addition, Luca contributed to several projects closer to the machine learning community using the Rust programming language. Linfa is an ambitious project that definitely deserves the attention of the data science community (and it's written in Rust, with Python bindings! How cool??!). References Ser...

Jun 19, 2020 · 27 min · Ep. 105

Rust and machine learning #1 (Ep. 107)

This is the first episode of a series about the Rust programming language and the role it can play in the machine learning field. Rust is one of the most beautiful languages I have ever studied so far. I personally come from the C programming language, though for professional activities in machine learning I had to switch to the loved and hated Python language. This episode is clearly not providing you with an exhaustive list of the benefits of Rust, nor its capabilities. For this you can check ...

Jun 17, 2020 · 22 min · Ep. 104

Protecting workers with artificial intelligence (with Sandeep Pandya CEO Everguard.ai)(Ep. 106)

In this episode I have a chat with Sandeep Pandya, CEO at Everguard.ai, a company that uses sensor fusion, computer vision and more to provide safer working environments to workers in heavy industry. Sandeep is a senior executive who can hide the complexity of the topic with great talent. This episode is supported by Pryml.io. Pryml is an enterprise-scale platform to synthesise data and deploy applications built on that data back to a production environment. Test ideas. Launch new products. Fast. ...

Jun 15, 2020 · 16 min · Ep. 103

Compressing deep learning models: rewinding (Ep.105)

As a continuation of the previous episode, in this one I cover compressing deep learning models and explain another simple yet fantastic approach that can lead to much smaller models that still perform as well as the original one. Don't forget to join our Slack channel and discuss previous episodes or propose new ones. This episode is supported by Pryml.io. Pryml is an enterprise-scale platform to synthesise data and deploy applications built on that data back to a production envir...

Jun 01, 2020 · 16 min · Ep. 102

Compressing deep learning models: distillation (Ep.104)

Using large deep learning models on limited hardware or edge devices is definitely prohibitive. There are methods to compress large models by orders of magnitude and maintain similar accuracy during inference. In this episode I explain one of the first methods: knowledge distillation. Come join us on Slack. References: Distilling the Knowledge in a Neural Network https://arxiv.org/abs/1503.02531; Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks ht...

May 20, 2020 · 22 min · Ep. 101

Pandemics and the risks of collecting data (Ep. 103)

Covid-19 is an emergency. True. Let's just not prepare for another emergency about privacy violation when this one is over. Join our new Slack channel. This episode is supported by Proton. You can check them out at protonmail.com or protonvpn.com

May 08, 2020 · 20 min · Ep. 100

Why average can get your predictions very wrong (ep. 102)

Whenever people reason about the probability of events, they tend to consider average values between two extremes. In this episode I explain why such a way of approximating is wrong and dangerous, with a numerical example. We are moving our community to Slack. See you there!

Apr 19, 2020 · 15 min · Ep. 99

Activate deep learning neurons faster with Dynamic RELU (ep. 101)

In this episode I briefly explain the concept behind activation functions in deep learning. One of the most widely used activation functions is the rectified linear unit (ReLU). While there are several flavors of ReLU in the literature, in this episode I speak about a very interesting approach that keeps computational complexity low while improving performance quite consistently. This episode is supported by pryml.io. At pryml we let companies share confidential data. Visit our website. Don't fo...

Apr 01, 2020 · 22 min · Ep. 98

WARNING!! Neural networks can memorize secrets (ep. 100)

One of the best features of neural networks and machine learning models is their ability to memorize patterns from training data and apply those to unseen observations. That's where the magic is. However, there are scenarios in which the same machine learning models learn patterns so well that they can disclose some of the data they have been trained on. This phenomenon goes under the name of unintended memorization and it is extremely dangerous. Think about a language generator that discloses the passwo...

Mar 23, 2020 · 24 min · Ep. 97

Attacks to machine learning model: inferring ownership of training data (Ep. 99)

In this episode I explain a very effective technique that allows one to infer whether any record at hand belongs to the (private) training dataset used to train the target model. The technique is effective because it works on black-box models, for which there is no access to the training data, the model parameters, or the hyperparameters. Such a scenario is very realistic and typical of machine learning as a service APIs. This episode is supported by pryml.io , a pl...

Mar 14, 2020 · 20 min · Ep. 96