
#92 How to Make Decisions Under Uncertainty, with Gerd Gigerenzer

Oct 04, 2023 · 1 hr 5 min · Season 1 · Ep. 92

Episode description

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!


I love Bayesian modeling. Not only because it allows me to model interesting phenomena and learn about the world I live in. But because it’s part of a broader epistemological framework that confronts me with deep questions — how do you make decisions under uncertainty? How do you communicate risk and uncertainty? What does being rational even mean?

Thankfully, Gerd Gigerenzer is there to help us navigate these fascinating topics. Gerd is the Director of the Harding Center for Risk Literacy of the University of Potsdam, Germany.

Also Director emeritus at the Max Planck Institute for Human Development, he is a former Professor of Psychology at the University of Chicago and Distinguished Visiting Professor at the School of Law of the University of Virginia. 

Gerd has written numerous award-winning articles and books, including Risk Savvy, Simple Heuristics That Make Us Smart, Rationality for Mortals, and How to Stay Smart in a Smart World.

As you’ll hear, Gerd has trained U.S. federal judges, German physicians, and top managers to make better decisions under uncertainty.

But Gerd is also a banjo player, has won a medal in Judo, and loves scuba diving, skiing, and, above all, reading.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau and Luis Fonseca.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)

Links from the show:

Transcript

Gerd Gigerenzer, welcome to Learning Bayesian Statistics. I'm glad to be here. Yeah, thanks a lot for taking the time. I am very happy to have you on the show. A few patrons have asked for your episode, so I'm glad to have you here today. And thank you very much to all of you in the Slack, in the LBS Slack, who recommended Gerd for an episode on the show. And yeah, I have a lot of questions for you because you've done a lot of things.

There are a lot of questions I want to ask you on a lot of different topics, but first, as usual, let's start with your origin story, Gerd. Basically, how did you come to study rationality and decision-making under uncertainty? Now, I have been observing myself and how I make decisions. For instance, in an earlier career, I was a musician playing Dixieland, jazz, and other things. And when I did my PhD work, I had to make a decision.

Did I want to continue a career on stage as a musician, or try an academic career? Mm-hmm. And for me, music was the safe option, because I knew it, and I also earned much more money than an assistant professor would. With an academic career, I couldn't know whether I could make it, whether I would ever become a professor, so it was the risky option. So this is, if you want, an initial story: I decided then to take the uncertainty and the risk. That makes sense.

And so that was like pretty early in your career, or is that something that came later on when you already had started studying other things, or you started doing that as soon as you started your undergrad studies?

What came later was that I learned about theories of decision making, and some of them I found very unrealistic and strange, and about topics that were not really the ones I thought were important, like which job do you take, what do you do with the rest of your life. They were about monetary gambles: do you want a hundred dollars for sure, or two hundred with a probability of 0.4 or 0.6?

And I also spent an important year of my life at the Center for Interdisciplinary Research in Bielefeld, in a group called The Probabilistic Revolution. That was an international and interdisciplinary group that investigated how science changed from a deterministic worldview to a probabilistic one. And I learned so much. I was one of the young guys in this group. There were people like Thomas Kuhn, Ian Hacking, Nancy Cartwright. And that also taught me something.

It's important not just to read in your own discipline and do what the others do, but to fall in love with a topic, like decision making and uncertainty in the real world, and then read everything that people have written about it. And that means from areas like biology and animal behavior to economics, to sociology, to the history of science.

Yeah, that was something really interesting when preparing the episode with you, to see the whole arc of your career being basically around these topics that you've studied really a lot and in depth. So that was really super interesting to notice. And so something I'm wondering is if you remember how you first got introduced to Bayesian methods. Now, for instance, I read Fisher's book, Statistical Methods and Scientific Inference, where Fisher praises Thomas Bayes for having the insight not to publish his paper.

Because, according to Fisher, that's not what you need in science. And I got very much interested in the fights between statisticians, in something that could be called insult and injury. Fisher, for instance, in the same book, destroys Karl Pearson, his predecessor, saying the terrible weakness of his mathematical and scientific work flowed from his incapacity for self-criticism. So if you want to get anyone interested in statistics, then start with the controversies. That's my advice.

And the pity is that in the textbooks, in psychology certainly, all the controversies have been eliminated; one doesn't mention them, and talks as if there were only one kind of statistics. So that could be Fisher's null hypothesis testing, which has been turned into a very strange ritual that Fisher never would have accepted, or on the other side there are also Bayesians who think it's the only tool in the toolbox. And none of that attitude is realistic; it's more religious.

There is a statistical toolbox. And there are different instruments, and you need to look at the problem to choose the right one. And also within Bayes, there are so many different kinds of Bayesianism. There's not one. 64,000. It's a lot. Yeah, so, okay, that makes it clear. And that helps me also understand your work because, yeah, something I saw is that in your work, you often emphasize the role of heuristics in decision-making.

So I'm curious if you could explain how Bayesian thinking and heuristics intersect and... how do these approaches complement each other in navigating uncertainty? First, the term heuristic is often misunderstood. I mean the term in the sense that Herbert Simon used it to make a computer program smart, or the Gestalt psychologist used it, or Einstein used it in the title of his Nobel Prize winning paper of 1905.

I don't use it in the sense that has been very popular in psychology and other fields, as heuristics and biases. That's a clear misunderstanding. So to make it very short: there is a world that Jimmie Savage, who is often called the father of Bayesian statistics, called a small world, where the entire state space is known and nothing else can happen. That is the ideal world for Bayesianism and also for most of statistics.

In a world where you do not know the state space, which the economist Frank Knight called uncertainty, or what I have called true uncertainty or radical uncertainty, you can't optimize by definition. You cannot find the best solution. And here, people and other animals, just like managers and scientists, use heuristics. So a heuristic is a rule that helps you, under uncertainty, to find a good solution. For instance, Pólya, the mathematician, distinguished between analysis and heuristics.

You need heuristics to find a proof and you need analysis to check whether it was right. Most important, heuristics and analysis are not opposites, as has now become very popular in system one and system two theories. They're not opposites. They go together. For instance, a study of 17 Nobel laureates reported that almost all of them attributed their success to going back and forth between heuristics, or intuition, and analysis. So that's an important thing. They're not binary opposites.

So your question, where does Bayes meet heuristics? Well, for instance, in the determination of the prior probability distribution: a uniform prior, that's also known as one over N. So you divide, for instance, your assets equally over the funds or the stocks that you have. It's a reasonable assumption when you know little. And one over N is reasonable in some situations, but not always.

And the real challenge is to find out in what situation a certain heuristic, or Bayes, works, and where it does not work. That's what I call the study of ecological rationality. So in short, there's no single tool that's always the best. We need to face the difficult question: can we identify the structure of environments where a simple heuristic like equal distribution, or imitate others, works, and where does it mislead?

Yeah, yeah, this is really interesting, because it's something I always try to reconcile, and actually you talk about it in your book, Gut Feelings: The Intelligence of the Unconscious. You talk also about intuitions and how they can sometimes outperform more complex analytical processes. And this is a claim that you can see in a lot of fields, right?

From, I don't know, politics to medicine to sports, when basically people don't really want the analytical process to be taken too seriously, because maybe it doesn't confirm their previous analysis or their own biases.

So what I'm wondering is how do Bayesian methods in your research, how do Bayesian methods accommodate the role of intuitive judgment and how can individuals strike a balance between intuitive thinking and the systematic updating of beliefs that we use under Bayesian reasoning? So let me first define what I mean by intuition.

So intuition is a kind of unconscious intelligence that is based on years of experience with a topic where one feels quickly what one should do, what one should not do, but one cannot explain it. So when a doctor sees a patient and the doctor may feel something is wrong with that patient but cannot explain it, that's an intuition based on years of experience. And then the doctor will go on and do tests and analysis in order to find out what's wrong if there's something.

So remember, intuition and analysis always go together. It's a big error in what we have today in so-called dual-process theories, where they're presented as opposites. And then usually one side is always right, like analysis, and intuition is blamed, and heuristics are blamed, if things go wrong. I see. Yeah. And so how does that then integrate into the Bayesian framework, according to you? Like in the systematic analysis of beliefs that we have in the Bayesian framework.

So applications of Bayes use heuristics such as 1 over N, so equal distribution, equal priors. And they also use a more tacit independence assumption and such things. But I would not phrase the problem as how to integrate heuristics into the Bayesian framework. I would also not say how to integrate Bayes into the heuristics framework.

I think of both, so there are many Bayesian methods and also other statistical methods, the old optimizing methods, and there are heuristic methods, which are non-optimizing methods. I think of them as part of an adaptive toolbox that humans have, that they can use, and the real art is the choice of the right tool.

So when should I use Bayes, and what kind of Bayes, or when should I use a heuristic: a social heuristic, for instance, do what Alex tells me to do, or a simple heuristic like take-the-best, which just goes lexicographically through reasons and stops with the first one that allows you to make a decision. And that's the question of ecological rationality. I see. And do you have, yeah, do you have examples?

Bayes' rule is a rule that is reasonable to apply in situations where the world is stable, where no unexpected things happen, where you have good estimates for the priors and also good estimates for the likelihoods. For instance, mammography screening is a case. So we know, or we can expect, that the results of mammography screening won't change very much. We have to take into account that the base rates differ from country to country or from group to group.

But besides that, it is a good framework to understand what is the probability that a person has breast cancer if she tests positive. Mm-hmm. But that's a good situation. But if you have something which is highly volatile: for example, I worked with the Bank of England on a method for banking regulation, and that world is highly volatile, and you're not getting very far with standard statistical methods. But you may evaluate whether a bank is in trouble

by something that we call a fast-and-frugal tree, which only looks at maybe three or four important variables and doesn't combine them the way Bayes or linear models do, but lexicographically. Why? Think about medical diagnosis, for instance: if your heart fails, a good kidney cannot compensate for that. And this is the idea of lexicographic models. And a number of heuristics are lexicographic, as opposed to compensatory models like Bayes or linear regressions.
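To make the lexicographic idea concrete, here is a minimal sketch of a fast-and-frugal tree in Python. The cues and cut-offs (leverage, capital ratio, funding share) are invented for illustration and are not the Bank of England's actual criteria; the point is only that the cues are checked in a fixed order, the first one that triggers gives the decision, and a good value on a later cue cannot compensate for a bad value on an earlier one.

```python
def bank_vulnerable(leverage_ratio, capital_ratio, wholesale_funding_share):
    """Toy fast-and-frugal tree: check cues in a fixed order and exit at the
    first cue that allows a decision (lexicographic, non-compensatory).
    Cue names and thresholds are hypothetical, for illustration only."""
    if leverage_ratio > 30:                # first cue: very high leverage
        return "flag as vulnerable"
    if capital_ratio < 0.05:               # second cue: thin capital buffer
        return "flag as vulnerable"
    if wholesale_funding_share > 0.6:      # third cue: fragile funding
        return "flag as vulnerable"
    return "no flag"                       # no cue triggered

# Decided on the first cue alone; the healthy capital ratio never gets a say.
print(bank_vulnerable(leverage_ratio=35, capital_ratio=0.12, wholesale_funding_share=0.2))
```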

Oh, I see, okay. Yeah, continue. Yeah, I have myself trained about a thousand doctors in understanding and doing Bayesian diagnosis and Bayesian thinking. And you should realize that most doctors, and also most gynecologists, would not be able to answer the question I posed before: what is the probability that a woman has breast cancer in screening when the mammogram is positive? And if I give them the numbers in conditional probabilities, they're equally lost. Alex, I'll do a test with you.

Are you ready? So the point will be, I give you the information, as usual, in conditional probabilities. And I hope you will be confused. And also the listeners. And then I give you the same information in what we call natural frequencies. And then insight will come. Ready? Okay. So assume you conduct a mammography screening. What you know is that among the group of women who participate, there is a one percent chance that a woman has undetected breast cancer.

You also know that the probability that a woman tests positive if she has breast cancer is 90%. And you know that the probability that a woman tests positive if she does not have breast cancer is 9%. Okay? You have a base rate of 1%, a sensitivity or hit rate of 90%, and a false alarm rate of 9%. Now a woman in that group just tested positive, and you know nothing about her because it's screening. She asks you, doctor, tell me, do I now have breast cancer? Or how certain is it?

99%, 90, 50, please tell me. What do you say? If there is now fog in your mind, that's the typical situation of most doctors. Mm-hmm. And there have been conclusions made in psychological research that the human mind has not evolved to think statistically, or here, the Bayesian way. Now the problem is not in the mind, the problem is in the representation of the information. Conditional probabilities are something quite new. And few of us have been trained in it.

Now how did humans before Thomas Bayes, or animals, do Bayesian reasoning? Not with conditional probabilities, but with what we call natural frequencies. That is, I give you first a demonstration, then explain what it is. Okay, we use the same situation. You do the mammography screening and translate the probabilities into concrete frequencies. Okay? Think about a hundred women. We expect one of them has breast cancer, and she likely tests positive. That's the 90%.

Among the 99 who do not have breast cancer, we expect another 9 will nevertheless test positive. So we have a total of 10 who test positive. Question: how many of them actually have cancer? It's one out of 10. So a woman who tests positive in screening most likely does not have cancer. That's good news. So that's natural frequencies, and you basically see through. We call them natural frequencies because they're not relative frequencies. They're not normalized.

You start with a group like 100 and you just break it down. And then the computation becomes very simple. Just imagine Bayes' rule for this problem; with natural frequencies, the representation does the computation for you. It's just one out of the total number of positives, 10. That's all. And once doctors have learned that and tried it on a few problems, they can generalize it and use the method for other problems.
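As a quick check of the arithmetic just described, here is a minimal sketch in Python using the 1% base rate, 90% hit rate, and 9% false alarm rate quoted above; it computes the posterior once with Bayes' rule on conditional probabilities and once by simply counting natural frequencies.

```python
# Mammography example with the numbers quoted in the episode (illustrative).
base_rate = 0.01      # P(cancer)
sensitivity = 0.90    # P(test positive | cancer), the hit rate
false_alarm = 0.09    # P(test positive | no cancer)

# Conditional-probability route: Bayes' rule.
p_positive = base_rate * sensitivity + (1 - base_rate) * false_alarm
posterior = base_rate * sensitivity / p_positive
print(f"P(cancer | positive) = {posterior:.3f}")   # about 0.09, roughly 1 in 10

# Natural-frequency route: think of 100 concrete women and just count.
women = 100
with_cancer = 1                                            # 1% of 100
positives_with_cancer = 1                                  # she very likely tests positive
positives_without_cancer = round(false_alarm * (women - with_cancer))  # about 9 of the 99
print(positives_with_cancer / (positives_with_cancer + positives_without_cancer))  # 1 out of 10
```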

That way we can avoid the errors that are currently still in place, and doctors can better understand what tests like HIV tests or pregnancy tests actually mean. And the interesting theoretical point is, as Herbert Simon said, the solution to the problem is in its representation. And he took that from the Gestalt psychologists. Yeah, this is really interesting. I really love that. And in a way it's quite simple, right, to just turn to natural frequencies.

So I really love that because it gives a simple solution to a problem that is indeed quite pronounced, right? Where it's just like when you're... Even if you're trained in statistics, you have to make the conscious effort of not falling into the fallacy of... thinking, well, if the woman has a positive test and the test has a 99% hit rate, she's got a 99% probability of having breast cancer.

I have one part of my brain which knows that completely, because I deal with statistics all the time, but there is still the intuitive part of my brain, which is like, wait, why should I even wonder if that's the true answer? So I like the fact that natural frequencies are kind of an elegant and simple solution to that issue. And so I will put in the show notes your paper about natural frequencies and also the one you've written about HIV screening and how that relates to natural frequencies.

So that's in the show notes for listeners. And I'm also curious, basically concretely, how you did that with the professionals you've collaborated with. Because your work has involved collaborating with professionals from various domains. That means physicians, that means judges. I'm curious how you have applied these principles of risk communication in practice with these professionals, and what challenges and what successes have emerged from these applications.

Yeah, so I have always tried to connect my theoretical work with practical work. So in the case of the doctors, I have been teaching continuing medical education for doctors. So the courses that I give are certified and the doctors get points for them. And it may be a group of 150 or so doctors who are assembled for a day or two of continuing medical education, and I may do two hours with them.

And that has been for me a quite satisfying experience, because the doctors are grateful; they have muddled through these things all their lives, and now they realize there's a simple solution. They can learn it within half an hour or so, and then it sticks for the rest of their lives. I've also trained in the US, so I have lived many years in the US and taught as a professor at the University of Chicago.

And I have trained, together with a program from George Mason University, US federal judges. These are very smart people and I enjoyed that. So these trainings were in illustrious places like Santa Fe. And the judges were included and their partners also. And there was also a series of sessions about things like how to understand fibers. And I was teaching them how to understand risks and decision making and heuristics. And...

If you think that federal judges, who are among the best ones in the US, would understand Bayes' rule, good luck. No, there may be a few; most not. And actually, by the way, Bayes' rule is forbidden in UK law. Interesting. And so, going back, these are examples of training that every psychologist could do. But you have to leave your lab and go outside and talk to doctors and have something to offer them for teaching.

By now, the term natural frequencies is a standard term in evidence-based medicine. And I'm very proud of that. There's also a Cochrane review that has looked at various representations and found that natural frequencies are among the most powerful ones. And with some of our own students who were more interested in children than in doctors, we have posed the question: can we teach children, and how early?

And one of the papers I sent you, it's a paper in the Journal of Experimental Psychology: General, I think two years ago, has for the first time tested fourth graders, fifth graders, sixth graders, and second graders. So when we did this, the teachers were looking at the problems and saying, no, that's much too difficult. The children will not be able to do that. They haven't even had fractions. But you don't need fractions.

And for instance, we use problems that are more childlike. So here we put that type of problem, and when they are in natural frequencies, and the numbers are two-digit numbers (you can't do larger numbers with fourth graders), then the majority of the fourth graders got the exact Bayesian answer. Of course, with conditional probabilities, they would be totally lost. And we also found that some, maybe 20%, of the second graders find the Bayesian answer.

The title of the paper is Do Children Have Bayesian Intuitions? Yeah, it's in the show notes. And again, it's in the representation. It's a general message in mathematics that the representation of numbers matters. And if you don't believe it, just think about doing a calculation or Bayes' rule with Roman numerals. Good luck. And that's well known in mathematics. For instance, the physicist...

Feynman made the point that mathematically equivalent forms of a formula, despite being mathematically equivalent, are not psychologically the same. Because, as I said, with one form you can see new directions, new guesses, new theories. In psychology, that is not always realized. And what Feynman, Richard Feynman, was talking about would be called framing in psychology. And by many of my colleagues, it's considered an error to pay attention to framing. It's not.

It's an enabler for intelligent decision-making. Yeah, this is fascinating. I really love that. And I really recommend the paper you're talking about, Do Children Have Bayesian Intuitions?, because first, I really love the experiment. I found it super, super interesting to watch.

And also, yeah, as you were saying, in a way, the conclusion that we can draw from that and basically how this could be integrated into how statistics education is done, I think is extremely important. And actually, yeah, I wanted to ask you about that. Basically, if you, what would be the main thing you would change in the way statistical education is done?

Well, so you're mainly based in Germany, so I would ask about Germany, or maybe just Europe in general, since our countries are pretty close on a lot of metrics. So I guess what you're saying for Germany could also apply to a lot of other European countries. It's actually starting to change. So some of my former postdocs are now professors, and some are in education.

And for instance, they have done experiments in schools in Bavaria, where the textbooks, in the 11th class, have Bayes' rule. And they show trees, but with relative frequencies, not natural frequencies. And I've run a study which basically showed that when pupils learn Bayes' rule from these textbooks with relative frequencies or conditional probabilities, and you test them later, 90% can't do it anymore. They've done something like rote learning. Never understood it.

And then, in class, teachers taught the students natural frequencies, which they had never learned before. And then 90% could do it. Something they had never heard of. So my former students convinced the Bavarian government with this study. And now natural frequencies, and thus understandable Bayes, are part of the math curriculum in Bavaria. So that's a very concrete example where one can help young people to understand.

And when they are older and are doctors or have another profession where they need Bayes, they will not be so blocked and have to muddle through and not understand. And if they are patients, then they know what to ask and how to find out what a positive HIV screening test really means, or a positive COVID test, and what information one needs for that. So I think that statistical literacy is one of the most important topics that should be taught in school.

We still have an emphasis on the mathematics of certainty. So algebra, geometry, trigonometry, beautiful systems. But what's most important for everyone in later life is not geometry, it's statistical thinking. I mean in practical life. And we are failing to do that. The result is that if you test people, including medical professionals, or, as we have tested, professional lawyers, with problems that require Bayesian thinking, most are lost.

And the level of statistical thinking is often so low that you really can't imagine it. Here's an example. Two years ago, the Royal Statistical Society of London asked members of parliament whether they would be willing to do a simple statistical test. And about 100 agreed. The first question was: if you throw a fair coin twice, what's the chance that it will land twice on heads?

Now, you might think that every member of parliament understands that there are four possibilities, and two heads is one of them, so that's one in four? No. About half understood, and the others did not. And the most frequent wrong answer was that it's still one half. It's just an illustration of the level of statistical thinking in our society. And I don't think that if we tested German politicians, we would do much better. And you might say, yeah, who cares about coins?

But look, there was COVID with all these probabilities. There is investment. There are taxes. There are tons of numbers that need to be understood. And if you have politicians that don't even understand the most basic things, what can we expect? No, for sure. I completely agree. And these are topics we already tackled in these podcasts, especially in episode 50, where I had David Spiegelhalter here on the podcast.

And we talked about these topics of communication of uncertainty and all these very interesting topics, especially education and how to include all that in education. So these are very interesting and important topics, and I encourage people to listen to that episode, number 50, with David Spiegelhalter. I will put it in the show notes. Yeah. I may add here that David and I have been working together for many years.

And he has been directing the Winton Centre for Risk and Evidence Communication in Cambridge. And I'm still directing the Harding Center for Risk Literacy. And both centers were funded by the same person, David Harding, a London investor, who had the insight that there's a problem. But the rest of the philanthropists don't really seem to realize that it would be important to fund these centers. The Winton Centre is now closed down, which is a great pity. And yeah.

So there's very little funding for that. There's funding for research. So when I do studies like this, with children, there's lots of funding for that. But the moment you apply what you learn in the real world to help society, funding stops. Except for philanthropists like David Harding. Mm-hmm. Any idea why that would be the case? The research agencies have not realized that science is more than having publications,

but that much of the science that we have is actually useful. That is realized when it's about engineering and about patents, yes. But that there are similarly positive tools, like natural frequencies, that help people understand their world, and that you can teach them, and that you then need a few people who just go out and teach doctors, lawyers, or school children, that is not really in the mind of politicians. Yeah, which clearly is a shame, right?

Because you can see how important probabilistic thinking is in a lot of fields. And especially in politics, right? Even electoral forecasting, which is something I've done a lot. Probabilistic thinking is absolutely of utmost importance. And yet, it's not there yet, and there's not a lot of interest in developing this, at least in France, which is where I have done these experiments. That's always been puzzling to me, actually.

And even in sports, one of the recent episodes I've done about soccer analytics with Maximilian Goebel, well, that was also an interesting conversation about the fact that basically the methods are there to use the data more efficiently, but a lot of European football clubs don't really use them for some reason, which for me is still a mystery, because that would help them make better use of their finite resources and also be more competitive.

So, yeah, that's definitely something I'm passionate to understand. So yeah, thanks a lot for doing all that work to try and help us understand all that. Everyone can help here. For instance, most people are with a doctor at some point, for COVID-19 or HIV tests or cancer screening. And everyone could ask the doctor: what's the probability that I actually have the disease, or the virus, if the test is positive? And then you will likely learn that your doctor doesn't know.

Or makes excuses. Then you can help your doctor understand: bring a natural frequency tree and show them. I've done this with quite a few doctors. As I said, I'm training doctors. I've trained more than 1,000, and my own researchers from the Harding Center have trained more than 5,000 more. And the last time I was with my own physician, I spent maybe 50 minutes with him, and 40 of them explaining to him where on the internet he can find reliable information.

The problem is not in the doctor's mind; the problem is in the education, at the medical departments, where doctors learn lots of things, but one thing they do not learn is statistical thinking. Mm-hmm. Yeah. With very few exceptions.

And I'm curious, did you do some follow-up studies on some cohorts of those doctors where you taught them those tools? It seemed to work in the moment when they applied it, so I'm curious about the retention rate of these methods. Basically, is it something like, oh yeah, when you force them in a way to use them, they see it's useful, that's good. But then when you go away, they just don't use them anymore.

And they just revert to the previous way they were doing things, which is of course suboptimal. So yeah, I'm curious how that goes. In continuing medical education, I have about 90 minutes, and I teach them many things, not just natural frequencies. And I teach them natural frequencies somewhere at the beginning, and I test them towards the end. So that's, yeah, a short time, a little bit more than an hour. There is no way for me to find these doctors again.

But we have done follow-up studies up to three months with students, teaching them how to translate conditional probabilities into natural frequencies. And the interesting thing is that the performance after the training is around 90%, meaning they get 90% of all tasks exactly right, and after several months it stays at the same level. Whereas in the control group, where they are taught conditional probabilities, exactly your problem is there.

So they learn it, not as well as with natural frequencies, but then a few days later it goes away, and after three months they are basically back where they started. Yeah. Some representations do not stick in the mind. And frequency representations do, if they are not relative frequencies. Yeah, this is definitely super interesting. So basically, to make it stick more, the idea would be to definitely use more natural frequencies. Is that what you were saying?

Yes, and of course it doesn't hurt if you continue thinking this way and do some exercises. Hmm, yeah. Yeah, yeah. I see. And something I'm also curious about, and that a lot of beginners ask me about, is what about priors, right? So I'm curious, in your job, how did you handle priors and the challenges regarding confirmation bias and the persistence of incorrect beliefs?

So in a more general way, what I'm asking is, how can individuals, particularly decision makers in fields like law or medicine that you know very well, avoid the pitfalls associated with biased prior beliefs and harness the power of Bayesian reasoning? Yeah, so in the medical domain, particularly in diagnostics, the priors are usually frequencies, and they are estimated from studies.

There's always the possibility that a doctor might adjust the frequency base rate a bit because he or she has some kind of belief that this patient may not be exactly from that group. But again, there's huge uncertainty about priors. And also, one should not forget, there's also uncertainty about likelihoods. Often in Bayesian statistics, the discussion centers on priors. But how do you know the likelihoods?

So, for instance, take the mammography problem again: the probability that you test positive if you don't have cancer, which in the example I gave as 9%, is roughly correct, but it varies. It depends on the age of the woman. It depends on quite a number of factors. And one should not forget that the likelihoods also have some kind of subjective element and judgment.

And then there's a third, more general assumption, namely the assumption that all these terms, the likelihoods and the base rates, which come from somewhere, maybe a study in Boston, would actually apply to patients in Berlin. Mm-hmm. And I can name you a few more assumptions. For instance, that the world is stable, that nothing has happened, that there's no different kind of cancer that has different statistics. So one always has to assume a stable world to do Bayes.

And one should be aware that it might not be. And that's why I use the term statistical thinking. Because you need to think about the assumptions all the time and about the uncertainty in the assumptions. And also realize that often, particularly if you have more complex problems, not just one test but many, and many other variables, you get into situations where Bayes slowly becomes intractable. Mm-hmm.

You might then think of using a different representation, like what we call a fast-and-frugal tree. That's a simple one: think of a natural frequency tree, but an incomplete one, where you basically focus on the important parts of the information and don't even try to estimate the rest, in order to avoid estimation error. And that's the key logic of heuristics. Under uncertainty, the big danger is that you overfit. You overfit the data.

You wrongly assume that the future is like the past. And in order to avoid overfitting, as the bias-variance dilemma shows in more detail, one needs to make things more simple. Maybe not too simple, but more simple. And trying to estimate all conditional probabilities may give you a great fit, but not good predictions. Yeah, so thanks a lot for this perfect segue to my next question, because this is a recurring theme in your work and in your research: simplicity.

You often emphasize simplicity in decision-making strategies. And so that was something I was wondering about, because, well, I, of course, love Bayesian methods. They are extremely powerful. They are, most of the time, really intuitive to interpret, especially the model parameters. But they are complex sometimes. And they appear even more complex than they are to people who are unfamiliar with them, precisely because they are unfamiliar with them.

So anything you're unfamiliar with seems extremely complex. So I'm wondering how we can bridge the gap between the complexity of Bayesian statistics, whether real or fantasized, and the need for simplicity in practical decision-making tools, as you were talking about, especially for professionals and the general public, because these are the audiences we're talking about here. Now there are two ways.

One is you stay within the Bayesian framework and, for instance, avoid estimating conditional probabilities. That would be what's called naive Bayes. And naive Bayes can be amazingly good. It also has the advantage that it is much easier to understand than regular Bayes. The second option is to leave the Bayesian framework and study how adaptive heuristics can give you what Bayes makes too complicated, and where there's too much overfitting.
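As a rough illustration of that first route, here is a minimal naive Bayes sketch in Python. The tiny binary dataset is invented; the point is only that each feature contributes its own per-class likelihood independently, so no joint dependency structure between features has to be estimated.

```python
import numpy as np

# Invented toy data: two binary features and a binary class label.
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]])
y = np.array([1, 1, 0, 0, 1, 0])

def fit_naive_bayes(X, y, alpha=1.0):
    """Bernoulli naive Bayes with Laplace smoothing: one prior per class and
    one independent per-feature likelihood per class, nothing joint."""
    classes = np.unique(y)
    priors = {c: (np.sum(y == c) + alpha) / (len(y) + alpha * len(classes)) for c in classes}
    likelihoods = {c: (X[y == c].sum(axis=0) + alpha) / (np.sum(y == c) + 2 * alpha) for c in classes}
    return priors, likelihoods

def predict_proba(x, priors, likelihoods):
    scores = {}
    for c, prior in priors.items():
        p = likelihoods[c]
        # The "naive" independence assumption: multiply per-feature likelihoods.
        scores[c] = prior * np.prod(np.where(x == 1, p, 1 - p))
    total = sum(scores.values())
    return {int(c): s / total for c, s in scores.items()}

priors, likelihoods = fit_naive_bayes(X, y)
print(predict_proba(np.array([1, 0]), priors, likelihoods))
```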

For instance, we have studied investment problems. So assume you have a sum of money and want to invest it in N assets. How do you do it? There are various methods that tell you how to weight your money across each of these N assets. There is Markowitz's Nobel Prize-winning method, that's standard statistics, the mean-variance portfolio, which tells you how you should do that. But when Harry Markowitz made his own investments for the time after his retirement...

You might think he used his Nobel Prize-winning optimization method. No, he didn't. He used a simple heuristic that's called 1 over N, or divide equally, the same as a Bayesian equal prior. And a number of studies have asked how good 1 over N is compared to the Nobel Prize-winning Markowitz model and also modern variants, including Bayesian methods.

The short answer is that 1 over N is mostly as good as Markowitz, or even better, and even the most modern sophisticated models that use any kind of complexity cannot really beat it. The more interesting question is the following: can we identify in what situations a heuristic like 1 over N, or any of the more complicated models, is ecologically rational? Because so far we have talked about averages. And you can see that 1 over N has no free parameters, very different from Bayes.

That means nothing needs to be estimated from data. It actually doesn't need any data. Thus, in the statistical terms of bias and variance, it may have a bias, and likely it has. Bias is the difference between the average estimate and the true situation. But it has no variance, because it doesn't estimate any parameters from data. And variance is the deviation of individual estimates, from different samples, around the average estimate.

And since there is no estimate, there is no variance. Markowitz or Bayesian models suffer from both errors. And the real question is whether the sum of bias and variance of one method is larger than that of the other. And what ecologically rational means, let me illustrate with Markowitz versus 1 over N. If N is larger, then you have more parameters to estimate, because the covariances just increase. That means more measurement error.
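For readers who want the formula behind this argument, here is the standard bias-variance decomposition of expected squared estimation error, in its usual textbook form (for prediction error one adds an irreducible noise term):

```latex
\mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  \;=\; \underbrace{\big(\mathbb{E}[\hat{\theta}] - \theta\big)^2}_{\text{bias}^2}
  \;+\; \underbrace{\mathbb{E}\big[\big(\hat{\theta} - \mathbb{E}[\hat{\theta}]\big)^2\big]}_{\text{variance}}
```

1 over N fixes the weights, so the variance term is zero and only the bias term remains; an estimated portfolio can have a smaller bias but pays for it with variance that grows with the number of parameters and shrinks only slowly with sample size.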

So you can derive from that, that in situations where we have a large number of assets, the complex methods will likely not be as good, while 1 over N doesn't get more estimation error; it has none anyhow. And then another thing: if the true distribution of the so-called optimal weights, which you can only know in the future, is highly skewed, then 1 over N is not a good model for that. But if it's roughly equal, then it is.

And then sample size plays a role for the estimation. So the more data you have, the more the Bayesian or Markowitz model will profit, while it doesn't matter for the 1 over N heuristic, because it doesn't even look at the data. So that's the kind of ecological rationality thinking. And there are some estimates, just to give you some flesh on that.

One study found that in, I think, seven out of eight tests, 1 over N made more money, in terms of the Sharpe ratio and similar criteria, than the optimal Markowitz portfolio, with 10 years of data. So they asked the question: how many years of data would one need so that the estimates get precise enough that eventually the complex model outperforms the simple heuristic? And that depends on the number of assets you have.

And if there are 50, for instance, then the estimate is that you need 500 years of stock data. So in the year 2500, we can turn to the complex models, provided the same stocks are still around in the stock market in the first place. That's a very different way to think about a situation. It's the Herbert Simonian way: don't think about a method by itself, and don't ever believe that a method is rational in every situation, but think about how the method matches the structure of the environment.
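A minimal simulation sketch of this kind of comparison is below. All numbers are invented; this is not the study Gerd cites, just an illustration of how an equal-weight portfolio and a plug-in mean-variance (Markowitz-style) portfolio can be compared out of sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_train, n_test = 20, 120, 120   # e.g. 10 years of monthly data, all simulated

# Simulate returns: asset-specific means plus a common market factor plus noise.
true_mu = rng.normal(0.008, 0.002, n_assets)
market = rng.normal(0.0, 0.04, (n_train + n_test, 1))
returns = true_mu + market + rng.normal(0.0, 0.05, (n_train + n_test, n_assets))
train, test = returns[:n_train], returns[n_train:]

# 1/N heuristic: equal weights, nothing estimated from data.
w_equal = np.full(n_assets, 1.0 / n_assets)

# Plug-in mean-variance weights from sample estimates (can go short; fine for a sketch).
mu_hat = train.mean(axis=0)
cov_hat = np.cov(train, rowvar=False)
w_mv = np.linalg.solve(cov_hat, mu_hat)
w_mv = w_mv / w_mv.sum()                   # normalize weights to sum to one

def sharpe(weights, rets):
    """Out-of-sample Sharpe ratio of a fixed-weight portfolio."""
    port = rets @ weights
    return port.mean() / port.std()

print("out-of-sample Sharpe, 1/N:          ", round(sharpe(w_equal, test), 3))
print("out-of-sample Sharpe, mean-variance:", round(sharpe(w_mv, test), 3))
```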

And that's a much more difficult question to answer than just claiming that something is optimal. Yeah, I see. That's interesting. I love the very practical aspect of that. And also, in a way, that focus on simplicity is something I find very important when thinking about parsimony. Why make something more difficult when you don't have to? And it's something that I always use in my teaching, where I teach how to build a model.

Don't start with the hierarchical time series model, but start with a really simple linear regression, with just one predictor, maybe. And don't make it hierarchical yet, even though that might make sense for the problem at hand.

Because from a very practical standpoint, if the model fails, and it will at first if it's too complex, you will not know which part to take apart and make better. So parsimony makes it way easier to build the model, and also to choose the priors: don't make your priors too complicated, find good enough priors and then go with that. I mean, the frequent use of the term optimal is mostly misleading.

Under uncertainty or intractability, you cannot find the optimal solution and prove it. It's an illusion. And under uncertainty, so when you have to make predictions, for instance about the future, and you don't know whether the future is like the past, quite simple heuristics outperform highly complex methods. An example: remember when Google engineers tried to predict the flu with a system called Google Flu Trends?

It was a secret system, and it started with 45 variables, which were also secret, and the algorithm was secret. And it ran from 2008 till 2015. At the very beginning, in 2009, the swine flu occurred, out of season, in the summer. And Google Flu Trends, the big data algorithm, had learned that the flu is high in the winter and low in the summer. So it underestimated the flu-related doctor visits, which was the criterion.

And the Google engineers then tried to revise the algorithm to make it better. And here are two choices. One is what I call the complexity illusion: you have a complex algorithm and high uncertainty, like the flu, a virus that mutates very quickly, and it doesn't work. What do you do now? You make it more complex. And that's what the Google engineers did. So they used a revision with about 160 variables, also secret,

and thought they would solve the problem, but it didn't improve at all. The opposite reaction would have been: you have a complex and highly uncertain problem, you have a complex algorithm, it doesn't work. What do you do now? You make it simpler. Because you have too much estimation error; the future isn't like the past. We have tested this and published a paper on a very simple heuristic that just takes one data point.

So remember that Google Flu Trends estimated next week's or this week's flu-related doctor visits. The one-data-point algorithm is: you take the most recent data, usually one or two weeks in the past, and then make the simple prediction that that's what it will be this week or next week. That's a heuristic called the recency heuristic, which is well documented in human thinking and is often mistaken for a bias. And we showed this for the entire run of Google Flu Trends, for eight years.

The simple heuristic outperformed Google Flu Trends across all its updates, I think three updates in total, for every year and for each of the updates, and reduced the error by about half. You can intuitively see why: a big data algorithm gets stuck when something unexpected happens, like the swine flu, while the recency heuristic can quickly adapt to the new situation. So that's another example showing that you should always test a simple algorithm first.
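Here is a sketch of the recency heuristic as described, on made-up weekly counts. This is not the Google Flu Trends data or their algorithm; it only shows the "predict that next week equals the most recently observed week" rule next to a rigid seasonal baseline when an out-of-season shock hits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented weekly flu-related visit counts: a seasonal cycle plus noise,
# with an unexpected out-of-season outbreak added around week 120.
weeks = np.arange(200)
visits = 100 + 60 * np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 10, weeks.size)
visits[120:130] += 150

# Recency heuristic: the forecast for week t is simply the observation at week t-1.
recency_forecast = visits[:-1]
recency_error = np.mean(np.abs(visits[1:] - recency_forecast))

# A "stuck" seasonal baseline: the forecast for week t is the observation 52 weeks earlier.
seasonal_forecast = visits[:-52]
seasonal_error = np.mean(np.abs(visits[52:] - seasonal_forecast))

print("mean absolute error, recency heuristic:", round(recency_error, 1))
print("mean absolute error, seasonal baseline:", round(seasonal_error, 1))
```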

And you can learn from the human brain. The heuristics we use are not what the heuristics-and-biases people think, always second best. No. You need to see that in a situation of high uncertainty, you pick the right heuristic. A way to find it is to study what humans do in these situations. I call this psychological AI. Yeah, I love that. And actually, before closing up the show, that sets us up nicely for one of my last questions, which is a bit more forward thinking.

Because you've been talking about AI and decision-making science. So I'm wondering how you see the future of decision science, and where Bayesian statistics fits into this evolving landscape, especially considering the increasing availability of data and computational power. And that may be related to your latest book.

Yeah. My latest book is called How to Stay Smart in a Smart World, and it teaches one thing: a distinction between stable worlds and unstable worlds. Stable worlds are like what the economist Frank Knight called a situation of risk, where you can calculate the risk, as opposed to uncertainty, that's unstable worlds. If you have a stable world, that's the world of optimization algorithms, at least if it's tractable. And here more data helps, because you can fine-tune your parameters.

If you have to deal with an unstable world, and most things are unstable, not just viruses but human behavior, then complex algorithms typically do not help in predicting human behavior. In my book I have a number of examples. Here you need to study smart adaptive heuristics that help. For instance, we are working with the largest credit rating company in Germany. And they have intransparent, secret, complex algorithms.

That has caused an outcry in the public, because these are decisions that determine whether you are considered when you want to rent a flat, and other things. And we have shown them that if they make the algorithms simpler, then they actually get better and more transparent. And that's an interesting combination. Here is one way forward for solving the so-called XAI, explainable AI, problem: first try a simple heuristic, that means a simple algorithm, and see how good it is.

And then test it competitively against a handful of complex algorithms, because the simple algorithm may do as well as or better than the complex ones. And it is also transparent. And that means that doctors, for instance, may accept an algorithm because they understand it. A responsible doctor would not really want a neural network diagnostic system that he or she doesn't understand. So the future of decision making would be, if you want it in a few sentences: take uncertainty seriously

and distinguish it from situations of risk. We are not there yet, I fear. And second, take heuristics seriously and don't confuse them with biases. And third, if you can, go out into the real world and study decision making there. How firefighters, whom Gary Klein studied, make decisions, how chess masters make decisions, how scientists come up with their theories. And you will find that standard decision theory, which is geared to small worlds of calculable risk, will have little to tell you about that.

And then have the courage to study empirically what experienced people do, how to model this as heuristics, and find out their ecological rationality. That's what I see as the future. Nice. Yeah, I find that super interesting, in the sense that it's also something I can see as an attractive feature of the Bayesian modeling framework for people coming to us for consulting or education, where the fact that the models are clear on the assumptions

and the priors and the structure of the model makes them much more interpretable, and so way less black-boxy than classic AI models. And that's, yeah, definitely a trend we see, and it's also related to causal inference. People most of the time want to know if X influences Y and in what way, and if that is, you know, a predictable way. And for that, causal inference fits extremely well in the Bayesian framework.

So that's also something I'm really curious to see evolve in the coming years, especially with some new tools that are starting to appear. Like, I had Ben Vincent lately on the show, for episode 97, and we talked about CausalPy and how to do causal inference in PyMC. And now we have the new do-operator in PyMC, which helps you do that. So, yeah, I really love seeing all those tools coming together to help people do more causal inference, and also more state-of-the-art causal inference.

And for the curious, we will do a modeling webinar with Benjamin Vincent in the coming weeks, probably in September, where he will demonstrate how to use the do-operator in PyMC. So if you're curious about that, follow the show. And if you are a patron of the show, you will get early access to the recording. So if you want to support the show for the price of a cafe latte per month, I thank you from the bottom of my heart.

Well, Gerd, I have so many other questions, but I think it's a good time to stop. I've already taken a lot of your time, so I want to be mindful of that. But before letting you go, I'm going to ask you the last two questions I ask every guest at the end of the show. Number one: if you had unlimited time and resources, which problem would you try to solve?

I would try to solve the problem of understanding the ecological rationality of strategies, particularly heuristics. Hmm. Yeah. You're the first one to give that answer. And that's a very precise answer. I am absolutely impressed. And second question: if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be? Oh, I would love to have dinner with two women. The first one is a pioneer of computers, Ada Lovelace.

And the second one is a woman of courage and brains, Marie Curie, the only woman who got two Nobel Prizes. And Marie Curie said something very interesting: "Nothing in life is to be feared. It is only to be understood. Now is the time to understand more, so that we may fear less." Curie said this when she discovered that she had cancer and was soon to die. Extremely inspiring. Yeah, thanks, Gerd. That's really inspiring. But having courage is something that's very important for every researcher.

And also having the courage to look forward, to dare, to find new avenues, rather than playing the game of the time. Well, on that note, thank you for coming on the show, Gerd. That was an absolute pleasure. I'm really happy that we could have this more, let's say, epistemological discussion than we're used to on the podcast. I love doing that from time to time. It was also filled with applications, and I encourage people to take a look at the show notes. I put

your books over there, some of your papers, a lot of resources for those who want to dig deeper. So thank you again, Gerd, for taking the time and being on this show. It was my pleasure. Bye bye.

Transcript source: Provided by creator in RSS feed.