In this week's episode, Mico Yuk, host of 'Analytics on Fire', joins Jon Krohn to share her effective business intelligence and analytics framework, BIDS, for persuading key decision makers. She crowns one "power" tool as the analytics king and discusses emerging tools that could challenge its dominance. Tune in for unapologetic insights on future and current BI trends and happenings from the world of BI and analytics.Additional materials: www.superdatascience.com/682Interested in sponsoring a S...
May 26, 2023•28 min
Unlock the power of XGBoost by learning how to fine-tune its hyperparameters and discover its optimal modeling situations. This and more, when best-selling author and leading Python consultant Matt Harrison teams up with Jon Krohn for yet another jam-packed technical episode! Are you ready to upgrade your data science toolkit in just one hour? Tune-in now!This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by Anac...
May 23, 2023•1 hr 12 min
Industrial machinery’s dependence on data science, tech stacks to build IoT platforms, and transitioning from data science to product: This week’s Friday episode with Allegra Alessi explores the minutiae of product ownership for the Internet of Things at packaging company Bobst. Join host Jon Krohn and his guest as they unpack how the IoT is leading factory production.Additional materials: www.superdatascience.com/680Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com...
May 19, 2023•30 min
Generative AI, MLOps, and making smart investments in AI: This week’s episode is critical listening for AI investors and generative AI creators. AI investor George Mathew talks with host Jon Krohn about the emerging generative AI stack, the critical elements of MLOps to ensure a scalable model, and the tools developers can use for a saleable product.This episode is brought to you by Posit, the open-source data science company, by AWS Inferentia, and by Anaconda, the world's most popular Python d...
May 16, 2023•1 hr 34 min
StableLM, the new family of open-source language models from the brilliant minds behind Stable Diffusion is out! Small, but mighty, these models have been trained on an unprecedented amount of data for single GPU LLMs. This week, Jon breaks down the mechanics of this model–see you there! Additional materials: www.superdatascience.com/678 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
May 12, 2023•12 min
How does one use marketing analytics to drive business success? Avinash Kaushik, Chief Strategy Officer at Croud and former Sr. Director of Global Strategic Analytics at Google joins Jon Krohn live for an exciting episode that covers the transformative power of AI, his 'four clusters of intent' framework and the value of hands-on data tools. This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by Anaconda, the worl...
May 09, 2023•1 hr 28 min
Chinchilla AI, and fine-tuning proprietary tasks with large language models: On this week’s Five-Minute Friday, host Jon Krohn outlines the principles of the Chinchilla Scaling Laws, the incredible power of models such as Cerebras-GPT based on these laws, and the impact of scaling on the number of viable applications and commercial use cases.Additional materials: www.superdatascience.com/676Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship in...
May 05, 2023•13 min
Wrangling data in Pandas, when to use Pandas, Matplotlib or Seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• The advantages of using pandas over o...
May 02, 2023•1 hr 9 min
Models like Alpaca, Vicuña, GPT4All-J and Dolly 2.0 have relatively small model architectures, but they're prohibitively expensive to train even on a small amount of your own data. The standard model-training protocol can also lead to catastrophic forgetting. In this week's episode, Jon explores a solution to these problems, introducing listeners to Parameter-Efficient Fine-Tuning (PEFT) and the leading approach: Low-Rank Adaptation (LoRA).Additional materials: www.superdatascience.com/674Intere...
Apr 28, 2023•5 min
Vincent Gosselin, CEO and co-founder of Taipy, an open-source Python library, joins Jon Krohn to discuss how to accelerate productivity in Python and build scalable, reusable, and maintainable data pipelines. Gosselin shares his breadth of wisdom honed over his decades-long AI career. This episode is brought to you by Pathway, the reactive data processing framework, and by Posit, the open-source data science company. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com...
Apr 25, 2023•1 hr 12 min
Get started with language models: Learn about the commercial-use options available for your business in this week’s Five-Minute Friday, where host Jon Krohn discusses four models that have many of the capabilities of ChatGPT and can run at a fraction of the cost.Additional materials: www.superdatascience.com/672Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Apr 21, 2023•17 min
Get to grips with AWS, Azure, Google Cloud Platform on this week’s episode. Host Jon Krohn speaks with Kirill Eremenko and Hadelin de Ponteves about CloudWolf, a cloud computing educational platform that prepares students for certification in AWS (Amazon Web Services). Find out why an accreditation in cloud computing could be the safest investment for your data science career. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in spon...
Apr 18, 2023•1 hr 3 min
How does Meta AI's natural language model, LLaMa compare to the rest? Based on the Chinchilla scaling laws, LLaMa is designed to be smaller but more performant. But how exactly does it achieve this feat? It's all done by training a small model for a longer period of time. Discover how LLaMa compares to its competition, including GPT-3, in this week's episode. Additional materials: www.superdatascience.com/670Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast ...
Apr 14, 2023•13 min
In this episode, Jon Krohn welcomes Adrian Kosowski, Co-Founder and Chief Product Officer at Pathway, who shares insights on streaming data processing and reactive data processing, and how they're shaping the future of machine learning. Tune in now for an unforgettable episode. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In...
Apr 11, 2023•1 hr 41 min
AI risks, RLHF, and inner alignment: GPT stands to give the business world a major boost. But with everyone racing either to develop products that incorporate GPT or use it to carry out critical tasks, what dangers could lie ahead in working with a tool that applies essentially unknowable means (inner alignments) to reach its goals? This week’s guest Jérémie Harris speaks with Jon Krohn about the essential need for anyone working with GPT to understand the impact of a system comprising inner ali...
Apr 07, 2023•56 min
GPT-4, augmenting human tasks with AI, and using GPT-4 commercially: Vin Vashishta speaks to host Jon Krohn about how to leverage GPT-4 and outperform your competitors in both speed and value. Learn how GPT-4 has outmatched its predecessors – and many skilled workers – in this latest iteration of large language models. This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by epic LinkedIn Learning instructor Keith M...
Apr 04, 2023•1 hr 5 min
GPT-4 has landed! But how well does it compare to GPT-3.5? Tune in to hear Jon stack its performance against its predecessor–the results might just blow your mind.Additional materials: www.superdatascience.com/666Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Mar 31, 2023•12 min
Angel investor and data science consultant Josh Wills sits down with Jon Krohn to discuss his former roles (Google, Slack, and Cloudera) and the essential skills for engineering scalable machine learning projects. This episode is brought to you by Pathway, the reactive data processing framework, and by epic LinkedIn Learning instructor Keith McCormick. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will lea...
Mar 28, 2023•1 hr 28 min
Can ChatGPT make us better and faster in our work, and is it the future or just another fad? In this episode, Jon Krohn delves into a new study from MIT about the tool’s potential productivity for white-collar tasks.Additional materials: www.superdatascience.com/664Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Mar 24, 2023•5 min
NLP, transformer architectures, and machines beating humans at their own game: Jon Krohn talks to Alexander H. Miller about his work in building a machine that can outsmart humans in the game of Diplomacy by engineering powers of persuasion and collusion to its own advantage. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcas...
Mar 21, 2023•1 hr 17 min
Our list of the top 10 SuperDataScience podcast episodes for 2022 is here. From Pandas to causality, AI breakthroughs and data storytelling, these were your most popular episodes of the year gone by. Additional materials: www.superdatascience.com/662 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Mar 17, 2023•8 min
Chip Huyen, co-founder of Claypot AI and author of O'Reilly's best-selling "Designing Machine Learning Systems" is here to share her expertise on designing production-ready machine learning applications, the importance of iteration in real-world deployment, and the critical role of real-time machine learning in various applications. Technical listeners like data scientists and machine learning engineers will definitely enjoy this one! This episode is brought to you by Pathway, the reactive data ...
Mar 14, 2023•1 hr 17 min
ChatGPT is well-known for its potential to disrupt the writing industry, but in what other, perhaps less explored, ways can we use the tool? In this episode, Jon Krohn outlines five critical ways that ChatGPT can augment a data scientist’s work. From generating code to acting as a translation tool for programming languages, listen in to hear why ChatGPT could become a vital part of every data scientist’s toolkit. Additional materials: www.superdatascience.com/660 Interested in sponsoring a Super...
Mar 10, 2023•4 min
NLP practitioners: this episode is for you. From the awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam presents to Jon Krohn a recipe for a successful project and the open-source NLP tools to get there. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsor...
Mar 07, 2023•1 hr 21 min
What makes data products popular? Brian T. O'Neill, Founder and Principal of Designing for Analytics, returns to the podcast to help us crack the code on building data products that people love. Additional materials: www.superdatascience.com/658 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Mar 03, 2023•36 min
Data engineering educator Andreas Kretz joins Jon Krohn for a 1-hour primer that covers everything you need to know about the most in-demand role in data. From skills to tools, problem-solving processes and more, growing your knowledge of data engineering only improves your marketability, so tune in today if you're ready to future-proof your data career. This episode is brought to you by Glean (glean.io), the platform for data insights fast, and by epic LinkedIn Learning instructor Keith McCormi...
Feb 28, 2023•1 hr 10 min
How to attract an AI recruiter’s attention: In this episode, Jon Krohn and Tribe AI CEO Jaclyn Rice Nelson break down the key ingredients needed to make a Tribe AI recruiter say “yes!” Get Jaclyn’s top tips for forward-thinking AI talent, the skills you need to learn, and the in-demand roles on Tribe’s list of clients. Additional materials: www.superdatascience.com/656 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Feb 24, 2023•42 min
Transparent data science, profitable AI, and what’s missing from a data science education: Pandata’s Data Scientist in Residence Keith McCormick and Jon Krohn discuss how “insights” can never be the end product of a data science project, how to ensure you have a specific goal at the start of a project that is related to revenue, and why there is so much miscommunication between data scientists and their clients. Exclude the C-suite at your peril!This episode is brought to you by Glean (glean.io)...
Feb 21, 2023•1 hr 43 min
14-year-old AI prodigy Mike Wimmer joins Jon Krohn to discuss his latest projects. Whether he's using AI to help conserve the world's coral reefs or launching his new IOT-based company, Mike is an endless source of inspiration in the field of AI. Additional materials: www.superdatascience.com/654 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Feb 17, 2023•45 min
Carlos Aguilar, the founder and CEO of Glean, a data exploration and visualization platform, knows a thing or two about starting and growing a tech startup. After recently raising a $7 million seed round, he sits down with Jon Krohn to dive into the makings of his platform and shares tips for building a great founding team and how to delight early adopters. In this episode you will learn:• How Glean extracts actionable insights from their client's data warehouses [06:48]• What sets Glean apart f...
Feb 14, 2023•57 min