Hugo speaks with Allen Downey about uncertainty in data science. Allen is a professor of Computer Science at Olin College and the author of a series of free, open-source textbooks related to software and data science. Allen and Hugo speak about uncertainty in data science and how we, as humans, are not always good at thinking about uncertainty, which we need to be in such an uncertain world. Should we have been surprised at the outcome of the 2016 election? What approaches can we, as a data repo...
Sep 24, 2018•59 min•Transcript available on Metacast Hugo speaks with Renee Teate about the many paths to becoming a data scientist. Renee is a Data Scientist at higher ed analytics start-up HelioCampus, and creator and host of the Becoming a Data Scientist Podcast. In addition to discussing the many possible ways to become a data scientist, they will discuss the common data scientist profiles and how to figure out which ones may be a fit for you. They’ll also dive into the fact that you need to figure out both where you are in terms of s...
Sep 17, 2018•1 hr 1 min•Transcript available on Metacast Hugo speaks with Eric Colson, Chief Algorithms Officer at Stitch Fix, an online personal styling service reinventing the shopping experience by delivering one-to-one personalization to their clients through the combination of data science and human judgment. Eric is responsible for the creation of dozens of algorithms at Stitch Fix that are pervasive to nearly every function of the company, from merchandise, inventory, and marketing to forecasting and demand, operations, and the styling recommen...
Sep 10, 2018•1 hr•Transcript available on Metacast Meet Tanya Cashorali, a founding partner of TCB Analytics, a Boston-based data consultancy. Tanya started her career in bioinformatics and has applied her experience to other industries such as healthcare, finance, retail, and sports. We’ll be talking about what it means to be a data consultant, the wide range of industries that Tanya works in, the impact of data products in her work and the importance of rapid prototyping and getting MVPs or minimum viable products out the door. How does Tanya ...
Sep 03, 2018•51 min•Transcript available on Metacast Hugo speaks with JD Long, VP of risk management for Renaissance Reinsurance, about applications of data science techniques to the omnipresent worlds of insurance, reinsurance, risk management and uncertainty. What are the biggest challenges in insurance and reinsurance that data science can impact? How does JD go about building risk representations of every deal? How can thinking in a distributed fashion allow us to think about risk and uncertainty? What is the role of empathy in data science?...
Aug 27, 2018•1 hr•Transcript available on Metacast Hugo speaks with Christie Bahlai, Assistant Professor at Kent State University, about data science, ecology, and the adoption of techniques such as machine learning in academic research. What are the biggest challenges in ecology that data science can help to solve? What does the intersection of open science and data science look like? In scientific research, what is happening at the interface between data science & machine learning methods, which are pattern-based, and traditional research ...
Aug 20, 2018•56 min•Transcript available on Metacast Hugo speaks with Yves Hilpisch about how data science is disrupting finance. Yves’ name is synonymous with Python for Finance and he is founder and managing partner of The Python Quants, a group focusing on the use of open source technologies for financial data science, artificial intelligence, algorithmic trading and computational finance. Why are banks such as Bank of America & JP Morgan adopting the open source data science ecosystem? What are the major sub-disciplines of Finance that dat...
Aug 13, 2018•59 min•Transcript available on Metacast Hugo speaks with Amber Thomas about data journalism, interactive visualization and data storytelling. Amber is a journalist-engineer at The Pudding, which is a collection of data-driven, visual essays. We’ll discuss the ins and outs of what it takes to tell interactive journalistic stories using data visualization and, in the process, we’ll find out what it takes to be successful at data journalism, the trade-off between being a generalist and a specialist and much more. We’ll explore these ...
Aug 06, 2018•56 min•Transcript available on Metacast What are the biggest challenges in Pharmaceuticals that data science can help to solve? How are data science and statistics generally embedded in organizations such as Pfizer? What aspects of the pharmaceutical business run the gamut of nonclinical statistics? Hugo speaks with Max Kuhn, a software engineer at RStudio who was previously Senior Director of Nonclinical Statistics at Pfizer Global R&D. Max was applying models in the pharmaceutical and diagnostic industries for over 18 years....
Jul 30, 2018•1 hr•Transcript available on Metacast Hugo speaks with Derek Johnson, an epidemiologist with Doctors without Borders. Derek leverages statistical methods, experimental design and data scientific techniques to investigate the barriers impeding people from accessing health care in Lahe Township, Myanmar. If you thought data science was all machine learning, SQL databases and convolutional neural nets, this is gonna be a wild ride because, to get the data for their baseline health assessments, Derek and his team ride motorcycles into village...
Jul 23, 2018•55 min•Transcript available on Metacast Hugo speaks with Alan Nichol about chatbots, conversational software and data science. Alan is co-founder and CTO of Rasa, who build open source machine learning tools for developers and product teams to expand bots beyond answering simple questions. Which verticals is conversational software currently having the biggest impact on? What are the biggest challenges facing the fields of chatbots and conversational software? What misapprehensions do we as a society have about these technologies tha...
Jul 16, 2018•57 min•Transcript available on Metacast Hugo speaks with Taras Gorishnyy, a Senior Analytics Manager at McKinsey and Head of Data Science at QuantumBlack, a McKinsey company. They discuss the role of data science in management consulting, what it takes to change organizations through data science, how the different moving parts of data science have evolved over the past decade and in which direction they’re heading. You’ll see the impact that data science can have not only in tech, but also in such various verticals as retail, agricul...
Jul 09, 2018•57 min•Transcript available on Metacast Omoju Miller, a Senior Machine Learning Data Scientist with GitHub, speaks with Hugo about the role of data science in product development at GitHub, what it means to “use computation to build products to solve real-life decision making, practical challenges” and what building data products at GitHub actually looks like. Machine learning has the power to automate so much of the drudgery around data science & software engineering, from automated code review to flagging security vulnerabilitie...
Jul 02, 2018•59 min•Transcript available on Metacast What are best practices for organizing data science teams? Having data scientists distributed through companies or having a Centre of Excellence? What are the most important skills for data scientists? Is the ability to use the most sophisticated deep learning models more important than being able to make good PowerPoint slides? Find out in this conversation with Jacqueline Nolis, a data science leader in the Seattle area with over a decade of experience. Jacqueline is currently running a consul...
Jun 25, 2018•59 min•Transcript available on Metacast What are the biggest challenges currently facing data security and privacy? What does the GDPR mean for civilians, working data scientists and businesses around the world? Is data anonymization actually possible or a pipe dream? Find out in Hugo's conversation with Katharine Jarmul, a data scientist, consultant, educator and co-founder of KI Protect, a company that provides real-time protection for your data infrastructure, data science and AI. Links from the show KI Protect, providing real-time...
Jun 18, 2018•57 min•Transcript available on Metacast Why are spreadsheets ubiquitous in data analytics, and why are so many data scientists anti-spreadsheet? Join Jenny Bryan, a software engineer at RStudio & recovering biostatistician who takes special delight in eliminating the small agonies of data analysis, and Hugo to discover why spreadsheets are in fact necessary in data analytics and how spreadsheet workflows can be incorporated into more general data science flows in sustainable and healthy ways. Welcome to the future. Links from the show...
Jun 11, 2018•59 min•Transcript available on Metacast Community building is an essential aspect of data science. But how do you do it? Find out in Hugo's conversation with Jared Lander, organizer of the New York Open Statistical Programming Meetup and the New York R Conference. Jared is also the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City and an Adjunct Professor of Statistics at Columbia University. How does Jared think about creating safe and welcoming spaces for budding and practicing data scientis...
Jun 04, 2018•53 min•Transcript available on Metacast "Cloud computing is a huge revolution in the computing space, and it's also probably going to be one of the most transformative technologies that any of us experience in our lifetime." Paige Bailey, Senior Cloud Developer Advocate at Microsoft, in this episode of DataFramed. In this conversation with Hugo, Paige reports from the frontier of cloud-based data science technologies, having just been at the Microsoft Build and Google I/O conferences. What is the future of data science in the cloud? ...
May 28, 2018•1 hr•Transcript available on Metacast What do online experiments, data science and product development look like at Booking.com, the world’s largest accommodations provider? Join Hugo's conversation with Lukas Vermeer to find out. Lukas is responsible for experimentation at Booking in the broadest sense of the word: from Infrastructure and Tools used to run experiments, Methodology and Metrics that help people make decisions to Training and Culture that help people understand what to do. They'll be talking about how Booking leverage...
May 21, 2018•59 min•Transcript available on Metacast Building models of the world is dangerous and there are pitfalls everywhere, even down to the assumptions that you make. To find out about many statistical pitfalls, and how to build more robust data scientific models using statistical modeling, whether it be in tech, epidemiology, finance or anything else, join Hugo's chat with Michael Betancourt, a physicist, statistician and one of the core developers of the open source statistical modeling platform Stan.
May 14, 2018•57 min•Transcript available on Metacast How can data science help in the fight against cancer? What are its limitations? Find out in this conversation from the frontier of research. Hugo speaks with Sandy Griffith from Flatiron Health, a healthcare technology and services company focused on accelerating cancer research and improving patient care. Sandy is Principal Methodologist on Flatiron's Quantitative Sciences team and is tasked with leveraging data science "To improve lives by learning from the experience of every cancer patient"...
May 07, 2018•54 min•Transcript available on Metacast Anthony Goldbloom, CEO of Kaggle, speaks with Hugo about Kaggle, data science communities, reproducible data science, machine learning competitions and the future of data science in the cloud. If you thought that Kaggle was merely a platform for machine learning competitions, you have to check out this chat, because these ML comps account for less than a third of activity on Kaggle today. In the discussion: Kaggle kernels for reproducible data science and the evolution of the Kaggle public data ...
Apr 30, 2018•52 min•Transcript available on Metacast "We should be looking at Automated Machine Learning tools as more like data science assistants, rather than replacements for data scientists" -- Randy Olson, Lead Data Scientist at Life Epigenetics, Inc. Randy specializes in artificial intelligence, machine learning, and created TPOT, a Data Science Assistant and a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. Will the future of data science be automated? Which verticals will experien...
Apr 23, 2018•1 hr•Transcript available on Metacast Michelle Gill, a deep learning expert at NVIDIA, an Artificial Intelligence company that builds GPUs, the processors that everybody uses for deep learning, speaks with Hugo about the modern superpower of deep learning and where it has the largest impact, past, present and future, filtered through the lens of Michelle's work at NVIDIA. Where is the modern superpower of deep learning most effective? Where is it not? Where should we channel our skepticism of the hype surrounding it?
Apr 16, 2018•51 min•Transcript available on Metacast Sebastian Raschka, a machine learning aficionado, data analyst, author, Python programmer, open source contributor, computational biologist, and occasional blogger, speaks with Hugo about the role of data science in modern biology and the power of deep learning in today's rapidly evolving data science landscape. How is Sebastian using deep learning to build facial recognition software that also prevents racial and gender profiling? Check out this week's episode to find out.
Apr 09, 2018•58 min•Transcript available on Metacast Drew Conway, world-renowned data scientist, entrepreneur, author, speaker and creator of the Data Science Venn Diagram, speaks with Hugo about how to build data science teams, along with the unique challenges of building data science products for industrial users. How does Drew now view the Venn circles he created, those of hacking skills, mathematical and statistical knowledge and substantive expertise, when building out data science teams?
Mar 26, 2018•1 hr•Transcript available on Metacast Fake news: how can data science and deep learning be leveraged to detect it? Come on a journey with Mike Tamir, Head of Data Science at Uber ATG, who is building out a data science product that classifies text as news, editorial, satire, hate speech and fake news, among others. We'll also see what types of unique challenges Mike faced in his work at Takt, using data science to service the needs of Fortune 500 companies such as Starbucks. Links from the show FROM THE INTERVIEW FakerFact(Chrome Ex...
Mar 12, 2018•58 min•Transcript available on Metacast Nuclear engineering, data science and open source software development: where do these all intersect? To find out, join Hugo and Katy Huff, Assistant Professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois where she leads the Advanced Reactors and Fuel Cycles research group.
Mar 05, 2018•57 min•Transcript available on Metacast How does data science help BuzzFeed achieve online virality? What type of mass online experiments do data scientists at BuzzFeed run for this purpose? What products do they develop to make all of this easy and intuitive for content producers? Find out about all of this and more in this episode when Hugo talks with Adam Kelleher, Principal Data Scientist at BuzzFeed and Adjunct Assistant Professor at Columbia University. They'll also dive into the role of thinking about causality in modern data s...
Feb 26, 2018•1 hr•Transcript available on Metacast Air pollution, the environment and data science: where do these intersect? Find out in this episode of DataFramed, in which Hugo speaks with Roger Peng, Professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health, co-director of the Johns Hopkins Data Science Lab and co-founder of the Johns Hopkins Data Science Specialization. Join our discussion about data science, its role in researching the environment and air pollution, massive open online courses for...
Feb 19, 2018•55 min•Transcript available on Metacast