¶ Science Never Proves Things
Hello, everyone, and welcome to the Mindscape Podcast. I'm your host, Sean Carroll. One of the things that I always like to say about science and how it gets done is that science never proves things. This is an important feature of science, especially in the modern world, where what science does, how it reaches conclusions, how trustworthy it is, these are all under contestation
by different parts of society. So it's important to understand what science is and how it actually reaches its conclusions. And the claim that science never proves things, which is something that most scientists would go along with me on,
comes from a comparison to real proof in mathematics or, for that matter, in logic. You know, most scientists have taken some math classes, at least enough to know what it means to prove something in the old-fashioned sense of Euclid and geometry or Aristotle and logic, proving a conclusion from some well-articulated premises.
¶ Deductive vs Inductive Reasoning
In the philosophical study of logic, this is known as deductive reasoning: you have some premises and you reach a conclusion. And science just doesn't go that way, right? Science looks at the world, it looks at all sorts of things in the world, and it tries
to figure out what the patterns are that the world follows, always knowing that tomorrow you might do a new experiment that will overturn your best guess as to what the pattern was, or maybe someone will do something as simple as just thinking of a better pattern, right? A theoretical physicist coming up with a better idea for what the laws of physics really are. So if science doesn't prove things, if it just sort of comes closer and closer in some sense to getting it right, then what is going on?
One very common idea about what's going on is inductive logic rather than deductive logic. In inductive logic, we begin to see a pattern. You know, A, B, C, D, E, F, G. The next one is probably going to be H, right? Because we think that probably you're just reciting the alphabet in alphabetical order. But
there's all sorts of paradoxes that come up when you do inductive logic. Like, how do you know that it's not A, B, C, D, E, F, Z? That's a sequence of letters that you could have. My old math teacher in college
used to hate those SAT questions or standardized test questions that would give you a series of numbers and ask you to guess the next one. Because he said, I can make any number I want. I can come up with a formula that would give you any number I want after the ones that you already showed me.
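The teacher's trick is easy to demonstrate. Here is a minimal sketch in Python (the sequence and the chosen continuation are made up for illustration): Lagrange interpolation through any finite list of values, plus whatever next value you like, yields a polynomial formula that fits all the data you were shown and then outputs the number of your choice.

```python
# A polynomial can extend any finite sequence with any "next" value we choose.
# Sketch of the math teacher's point: the data never determine the pattern.

from fractions import Fraction

def lagrange_next(seq, desired_next):
    """Fit a polynomial through (0, seq[0]), ..., (n-1, seq[n-1]), (n, desired_next),
    then evaluate it at x = n to show it yields the value we chose."""
    points = [(Fraction(i), Fraction(v)) for i, v in enumerate(seq)]
    points.append((Fraction(len(seq)), Fraction(desired_next)))

    def p(x):
        # Lagrange interpolation: sum over the basis polynomials.
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return p(Fraction(len(seq)))

# The "obvious" continuation of 2, 4, 6, 8 is 10, but a perfectly good
# formula continues it with 17 (or anything else):
print(lagrange_next([2, 4, 6, 8], 17))  # -> 17
```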
¶ Philosophy of Induction and Probability
So philosophers, unsurprisingly, are very interested in making as rigorous and careful as possible this idea of either induction, or whatever should replace induction, as the logic of understanding things in science. The names attached here go from old-school names like David Hume and John Stuart Mill to relatively newer ones like Rudolf Carnap, Carl Hempel, and Karl Popper, for example. And it's still an
ongoing thing. So this is something that lives at the intersection of how we think about science, but also how we think about probability. What probability is, how conditional probabilities work, Bayesian logic, all that stuff. And that's what we're going to be talking about today. Today's guest is Branden Fitelson, who is a philosopher at Northeastern University. And it's eye-opening to me,
as someone who is now part-time in a philosophy department, just the huge range of stuff that gets characterized as philosophy, right? Like, some philosophers are asking, what is the good? Others are asking, what happens when an observation is made in quantum mechanics? And others are doing pretty hardcore mathy logic. And well, there's mathematical logic, but there's also just big-picture questions about logic.
How do logic and probability get used, both in a perfectly rational world and also in the slightly irrational world in which we live? We're going to be talking about both of those things. I think it's intrinsically interesting to understand probability and logic better, but also super important to thinking about how science works. So let's go.
Branden Fitelson, welcome to the Mindscape Podcast. Thank you so much for having me, Sean. I'm a big fan. I think what you're doing here is super important, especially nowadays. I hope you still think those things after we are done talking, but I hope to make it true. So we're talking about stuff that is dear to my heart. We're talking about
increasing the probability that something is right. We're talking about what probability is, how it fits in with learning about things. Obviously, science cares about this a lot. So let's start at the very high level, and you tell me what probability really is.
¶ Kinds of Probability
What probability really is? Well, there's many kinds of probabilities. So there's probabilities in science. So, for instance, biology has its own conception of probability, which shows up in the theory of natural selection, especially in genetics. Physics, of course, as you know better than I, has lots to say about probability, both in classical and quantum physics.
Yeah. And in economics, we also use probability, and in many other special sciences. My interest in probability: I started out as a physicist, so I started out interested in probability in physics. That's how I got into it. Over the years, I became more and more interested in the role probability plays in thinking about evidence and how strong arguments are, that is, how strong something is as a reason for believing something else.
¶ Assessing Argument Strength
And that's kind of the application of probability that I'm most interested in nowadays. And in that context, there's still many kinds of probabilities, because when you're assessing the strength of an argument, it really depends on the context. So if you're playing a game of chance, say, you know, like poker, and a certain card comes up, you're wondering, well, what effect does that have on my probability of winning this hand?
Well, now you know what probabilities to use. They're given by, you know, the probabilities of a game of chance. Each card is equally likely to be drawn. And that allows you then to calculate the probability of any
hand, given, you know, what's left in the deck and so on. And so there it's very clear what probabilities to use to assess the strength of that as a reason for believing, say, that you'll win or lose the hand. In other contexts, it's much more difficult to say which probabilities are the appropriate
ones. So, for instance, if we're wondering whether a certain scientific theory is true. In fact, we might even be worried about whether a certain scientific theory of probability is true. And you might have two competing theories of what probability in a certain scientific context is like.
Well, there, how are you going to adjudicate how strong the arguments are? Well, you can't. It would be question begging to assume a notion of probability that, say, one of the theories adopts, but the other rejects. That would just be question begging. So what you need is some more general notion of probability.
that will allow you to evaluate arguments, even in those contexts. And as a philosopher, I want to go even more broad than that. I want to be able to assess arguments for the existence of God, maybe, for ethical claims, and so on. And as you get more and more abstract, and these contexts get further and further away, let's say, from games of chance, which is kind of the easiest case, it gets more and more controversial which kinds of probabilities
are the relevant ones. But, you know, I think of this like any other science, Sean. I think probability theory, it's a theory. And then what you do when you're faced with a certain situation is you have to construct models of the theory. And OK, that's a very complicated process, which involves making all kinds of assumptions and idealizations. And that's OK. And the goal there is to try to come up with the best account of which probabilities we should use, so that we're
adjudicating this question of how strong the arguments are in a way that's fair and reasonable. And that's really a case-by-case thing, I think.
¶ Problems with Frequency Theory
So I was just a couple of weeks ago at the retirement celebration conference for Barry Loewer, former Mindscape guest. And there was a talk by David Albert, former Mindscape guest, in part in response to things that I and others have been saying about quantum mechanics and self-locating probabilities. And David, he was just
unapologetically old school about it. He says the only sensible use of probability is when you have a frequency of something happening over and over again. And you can sort of imagine taking a limit of it happening an infinite number of times, and the ratio of the number of times where it looks
like X rather than Y, that's the probability. And, you know, I tried to say, but OK, come on. We certainly use probability in a much broader sense than that. We talk about the probability of a sports team winning a thing, even though we're not going to do it an infinite number of times, or even twice. So is there a consensus about this very basic question about the relationship between frequencies and probabilities, versus just a more epistemic, this-is-my-best-guess kind of thing?
Right. Okay, good. Yeah. So there's lots of, you know, what used to be called interpretations of probability, but I would just call them theories of probability. As I say, there are many. The frequency theory, well, it's a very strange theory, actually. I mean, it started off as an actual, finite frequency theory where, you know,
the probability of some event is actually just given by the actual frequency of that event in some population. So for instance, suppose you have a coin and it's been tossed exactly five times, three heads and two tails, and then it's destroyed. Well, according to the actual frequency theory, the probability is, you know, three-fifths that it's heads. Wow.
And in fact, if there's any odd number of tosses, actual odd number of tosses, then you can't get one half. So even if you have a fair coin, if it's tossed an odd number of times, then according to the actual frequency view, it isn't fair.
So the actual frequency view was a non-starter, right? That's not going to work. I hope so, yeah. Also, you can't get irrational values. You can only get rational values, and that seems wrong, because physics probably has all kinds of irrational values. Okay, so then people said, well, okay, maybe what we'll do is we'll talk about hypothetical infinite extensions of the actual experiment.
Okay, well, what does that mean? They say things like, well, it's what would have happened had you continued that initial sequence of five tosses indefinitely. And I want to say, well, that's very hard to understand, because there's uncountably many such extensions. And on almost all of them, there's no limiting frequency.
So it's true that for any real number, you can get that as the limiting frequency of some infinite sequence. But it's also true that almost all of the sequences don't have one; they diverge in their limiting frequency. Sorry, this sounds like you're paraphrasing some technical result with the use of the idea of almost all. That's a technical math term.
Yes, that's right. I just mean, well, it's not all. There's like a relatively small number of sequences that will converge. It's sort of like if you pick a real number at random, what are the chances of getting a rational number? Pretty small. Most of them are not rational, by any reasonable measure of most.
And the same thing is true here. You have all these sequences. Well, which one? And so then you've got to say, well, which hypothetical infinite extensions are the ones that actually give you the real probability? And I just think this is just the wrong way to go.
¶ Probability Is What Theory Says
My view is I like to make an analogy with measurement in general, say in physics. So you might think, you might ask yourself, what is mass? Say, suppose just for the sake of argument that we're in a Newtonian universe and mass just behaves the way Newton thought it did, just for sake of argument. And then you think, well, what is mass anyway?
Well, my view about what mass is in such a universe is it's whatever the theory says it is. It's the functional role played by that concept in all the laws. And that's a very complicated thing. There's no easy way to summarize it. It's just whatever Newton's theory says it is. But you might be tempted by a different view. You might think, well, wait, maybe it's just frequencies. Maybe what you do is you make measurements and then you take an average.
And maybe if you take infinitely many measurements and you take the limiting value of the average, maybe that's what the mass of the object is. No. If you were to do that (of course, you can't actually do it), then, if you're lucky, you would get something very close to the actual mass. But that isn't what the mass is. And I want to say the same thing about probability.
Suppose you're doing some quantum mechanical experiment, right? You can make measurements. That's what you do. You make a lot of measurements and you take averages and you do statistics. And that's how you estimate the probability that something will be observed in a quantum mechanical system. But that's not what the probability is. The probability is what the theory says it is.
And whatever that is, well, one property it has is, you know, you use the Born rule to calculate what the probability is. OK, that's a really complicated theoretical story. But the probability isn't any sequence of measurements. It's not any limiting frequency.
That's a symptom of this property of probability, but the property is what the theory says it is. So I just think the frequency view has got everything backwards. Frequencies are just the way we maybe know about probabilities, but they're not what the probabilities are.
¶ Objective vs Epistemic Probability
So that's my view. Oh, I'm very sympathetic to that. How does that view fit in with the sort of classic divide between thinking that probabilities are mostly epistemic, they're about our knowledge, versus that probabilities latch on to some objective chances out there in the world?
Oh, I think certainly there are objective probabilities. As I said, I think not just in physics, but also in biology. I think each theory has its own concept of probability, at least the probabilistic theories do. And what probability is in those
systems is whatever the theory says it is. It's just like mass. So I have a very flat-footed view of that. And so in quantum mechanics, well, we know how to calculate probabilities; the theory tells us how. In statistical mechanics, we can calculate probabilities as well. And now you might
wonder about the interpretation of those probabilities, but you can certainly calculate things which obey the laws of probability in statistical mechanics. And so in that sense, at least, they are probabilities. They satisfy the formal principles of probability.
Yeah, I mean, so I want to say certainly there are objective probabilities, no question. I'm a scientific realist. So, you know, if I accept a theory and the theory says there's a thing, then there's that thing. That's it. So I'm a realist, so I don't have any problem with that.
¶ Adjudicating Theories with Probability
However, the problem is, and where things get really tricky, I think, and this is what got me really interested in other notions of probability: the tricky thing is, as I said, suppose there's a dispute about the nature of probability in some physical context, right? There's a dispute about that. You have two theories. One theory says probability behaves like this; another theory says it behaves like that. And then you do experiments,
and you try to use that data to adjudicate: does the evidence favor the one theory of probability over the other? Well, whatever probability you're using there, you've got to be very careful, because you don't want to beg any questions. So you don't want to use the probability that the one theory says is correct but the other says is incorrect to do the very calculations
of how strong the arguments are. That would be question begging. So I think in those settings, you need some other notion of probability. And that's where I think the epistemic notion comes in. I think you need it in at least these contexts where you're actually trying to adjudicate different physical theories of probability. You can't use what one theory says to adjudicate, because that would just beg the question
against the other theory. And so I think in those contexts, at least, you're going to need some other notion of probability, something neutral, like a judge is impartial. So you need some impartial notion of probability. And I think this is the kind of notion that statisticians have been trying to come up with, you know, ever since that early work in genetics, which is where it all really started, with Fisher and Haldane and Pearson and all those kinds of people.
But this is where I think Bayesianism is helpful, the Bayesian approach, because at least in these contexts where we're not sure, we're uncertain what the correct theory of probability is. We need something that feels like it's got to be epistemic. At least it's got to be neutral. And it's got to be something you can use to adjudicate. Does the evidence favor one theory of probability, say, over another? And for that matter, like I said, you want to adjudicate debates in other areas where...
Who knows what the probability of Newton's theory is? I mean, even if there are objective physical probabilities, it's hard to imagine how they would tell us what the probability is that Newton's theory is true. Good. Well, I mean, it's just, what would that mean? So I think in those contexts where we're adjudicating physical theory, say Newton versus Einstein, or two different versions of quantum theory, or something else, we're going to need
some other notion of probability. And that's where I think the Bayesian approach comes in. You kind of need something like that, because you need something neutral. I think I would have said 15 minutes ago that I don't believe that there is any such thing as objective probability in the world. I think that there's the world, and we describe the world the best we can, and maybe we have incomplete information, so we appeal to some probability, but there's some exact description of it also. But...
And of course, if you're judging between different theories of the world, then you have some epistemic view of probability. But now you're pointing out that, OK, but there's a notion of a thing that appears in a theory, whether it's quantum mechanics or whatever, and that thing obeys the laws of probability, it adds up to one and whatever, and we might as well call that objective. Yeah, yeah. I mean, just like I would want to call mass objective.
I would say the probability in quantum mechanics that's delivered by, you know, the Born rule, or however you calculate it, whatever that is,
it's some real thing. It's just as real as mass or any other theoretical quantity, it seems to me, that the theory implicitly defines through its laws. So, yeah, I mean, again, I'm a realist, so I have to just fess up to that. But as I say, even if you're not a realist, even if you think, okay, maybe there's different kinds of probabilities, but none of them is objective in the relevant sense, still, if you want to know whether some evidence favors one of those theories over another one,
and you want that to be a probabilistic inference, which it is, because it's not going to be deductive. I mean, after all, scientific evidence doesn't entail the answer to these questions. It doesn't deductively guarantee that one theory is true and the other is false. At best, it favors one over another. And I think the best way to model that is probabilistically. But then you need a general framework of probability that's going to have to be
I don't know, epistemic or something less objective in that sense, because otherwise it would run the risk of just begging the question. So good. This leads us right into where I wanted to go, which is the...
¶ Hume's Skeptical Argument
idea of induction and how in the early days, people tried to hope that inductive reasoning, you know, looking at many, many examples and generalizing would be a kind of logic that would fit the scientific process. And then other people point out that there are problems with induction. So, I mean, pretend we're in the philosophy 101 class. Like, what are the problems that people have with induction?
Well, of course, in philosophy and epistemology generally, you generally start out with the really, really hard problems, like skepticism. And induction is no different. When you're studying the philosophy of induction, you tend to start with these skeptical arguments. You know, like David Hume had a kind of skeptical argument. He's like, well, okay, you say there are these arguments that don't guarantee the truth of their conclusions if their premises are true. Well, their conclusions,
maybe they're quote-unquote probably true, but they're not guaranteed to be true like in mathematics. And he gave this dilemma. He said, well, let's think about how that would actually work. So suppose you've observed the sun rising a million times, and you infer on the basis of that historical evidence that the sun will rise tomorrow. Hume points out that, well,
that argument assumes some kind of principle of the regularity of nature, that, you know, the future will resemble the past. And now if you ask, how are you going to justify that premise that the future will resemble the past? Well, you can't give a deductive argument for it, because,
well, how would you do that? I mean, nothing you've observed is going to entail that the future will resemble the past. In other words, there'll always be some chance, which you can't rule out with certainty, that the future won't resemble the past. So it won't be a deductive argument. And then if it's an inductive argument, it just feels like now it's going to beg the question, because,
well, wait, what are you going to do? Reason as follows: in the past, the future has resembled the past, so therefore, in the future... and now you're just circular. Now it's just a circular argument, because you're assuming the very principle that you
¶ Responding to Skepticism
mean to justify in order to justify the argument. So, you know, philosophy always starts with these skeptical arguments. I mean, you don't have to worry just about induction; this happens in every field. Like, why believe there's an external world? After all, you can't rule out with certainty that there's an
evil demon, or that you're in a simulation, etc., etc. So I think what you've got to do when you're doing philosophy, the first thing you have to do in any of these domains, is figure out how you're going to respond to the skeptic. That is, what are you going to say? Why do you think that there's an external world? Let's start there, and then we'll get to induction. Well, why do you think there's an external world? Well...
I can only speak for myself. The reason I think there's an external world is, when I think about everything that I take myself to know, everything I take to be evidence about the world, all my observations, everything I take to be true, and I think, well, what's the best explanation of all of that? To me, I don't see any way to plausibly explain all that stuff without postulating
the existence of an external world that is mind-independent in many ways. And that's why I think there's a mind-independent external world. And now I want to take the same anti-skeptical view about induction. I think, well, how do you explain, say, the success of science, or what appears to be the progress of science? Well, I don't know, but it seems hard for me to explain that unless there were some principles of when evidence actually does favor one scientific theory over another.
¶ Einstein vs Newton Example
And does provide reason to believe one rather than the other. And so what I tend to do is think about historical cases that look like real scientific progress. And then what's the best way to explain that? For instance, when Einstein's theory of general relativity overtook Newton's theory of celestial motion, there were a lot of experiments that were crucial. One was the motion of Mercury.
Mercury moves in this very strange way around the sun, and it was known for a long time that it had this strange motion. Well, the Newtonians tried their best to give explanations of that.
And in the past, they had had similar episodes, and they were able to explain them by some missing mass that they found, you know, somewhere in the universe, that they didn't know about. But eventually they realized there isn't going to be the right hidden mass here.
Newton's theory is just not going to be able to predict this. This is just a thing that Newton's theory can't explain, can't predict. And then Einstein comes along and gives a theory that explains all the stuff Newton's theory could explain, and this thing, too, and a bunch of other stuff that it couldn't explain. Now I want to say, well, that just seems to make it more probable that Einstein's theory is true, or at least more probable that,
you know, that would be the better bet to make. That would be the more acceptable theory. And I think a probabilistic way of modeling that is just the best way that I know to model it. And so, again, I just think, well, what's the best explanation of these episodes of scientific progress?
And to me, part of that has to be, well, there just must be cases where the evidence really does favor one theory over another. Not that it guarantees that one's true and the other's false or anything like that, but it sort of raises the probability of one more than the other. And I just think, I don't know how else to explain episodes of scientific progress unless something like that is true.
So I believe that something like that is true. Now, the details of it are difficult to work out. But I think this is what statisticians, as I said, have largely been trying to figure out: how those inferences work. Like, when we have an experiment and we think the evidence favors one theory or another, what's the right
way to use probability, right, to model that. And there's a lot of disagreement, of course, in statistics between Bayesians and classical statistics. There's all kinds of different schools. But one thing they all agree on is there are episodes where the evidence favors one theory over another, and probability is an indispensable part of the explanation why.
¶ Accepting Leaps of Faith
They all agree on that much. It might be unfair of me, but I do think that it's a very common phase in an individual's philosophical maturation to realize that not everything can be established on rock-hard foundations that you agree with 100%. Like, sometimes you just got to say, this is the best we can do with what we got.
Absolutely. I think most of the time we're kind of in that situation. And that's OK. But, you know, that's the nature of these inferences. As I said, it's not like deduction. You don't have the certainty of mathematics in these kinds of inferences.
So you know there's going to be something that's underdetermined. It's not going to exactly determine completely what our attitude should be. There's going to be some wiggle room, some leeway. So in a way, you're always making something of a leap of faith when you do one of these ampliative or inductive inferences. And I just think you kind of have to live with that and do the best you can.
¶ Understanding Confirmation in Philosophy
And this leads us right into, you're very good at this, you're just bringing us along on the logical train of thought that we need to be on, the idea of confirmation. What we're trying to do is to formalize this idea, like you just said, that, you know, Einstein's theory is simple, it fits the data; Newton's theory doesn't fit the data.
In some sense, Einstein has now become more probably right than Newton. What sense is that? And confirmation is one of the words that gets batted around. I want you to really sort of carefully explain to us what that's supposed to mean, because I think many people informally think that if you've confirmed something, you know it's true, 100%. And that's not how philosophers use the word. No, that's right. That's right. So yeah, in ordinary language, the word confirmation has very strong
¶ Diagnostic Testing Analogy
connotations. But in the philosophy of induction, confirmation is actually a very weak claim. And I think a helpful example, I like to use simple examples, is diagnostic testing. I always like this example. And in a way, I think it's kind of fully general, because in a way you can think of scientific experiments as a kind of diagnostic test, where you're testing the world to see whether some hypothesis is true or false.
And so when you design an experiment, you really are in a way designing a diagnostic test. So let's think about diagnostic testing. For instance, there are many diagnostic tests that are very reliable that you can buy in the store now. So, for instance, you could buy a pregnancy test or an HIV test. Any of these tests that you buy, if you read the box, you'll notice something very interesting.
On the box, there's things they tell you and there's things they don't tell you. So one thing they tell you for sure is what they call the true positive rate and the false positive rate of the test, right? So the true positive rate is something like this: suppose that you have the disease,
then how probable would it be that you would get a positive result from this test? And then the false positive rate: suppose you don't have the disease, then how probable is a positive result? And the great thing about these diagnostic tests is you can determine
those error rates in the laboratory. You don't need to know anything about the particular subjects that are using it and so on. And that's why they can put that information on the box. It's very reliably known. Well, that ratio of the true positive rate to the false positive rate is called a Bayes factor. It's also called a likelihood ratio. And it doesn't determine how probable the hypothesis is
¶ Priors and Likelihoods
given a positive result. It doesn't determine that. In order to know that, how probable it is that you have the disease, you have to plug in what's called a prior probability, an a priori probability, philosophers call it. And what is that? Well, that's something like how probable you think it is before looking at the evidence.
Okay, well, what is that? Well, of course, the guys who design the test, they can't tell you what that is. That's going to depend very centrally on things about you. So, for instance, suppose it's a pregnancy test. If someone takes a pregnancy test and they get a positive result, well, they'll know the likelihood ratio, they'll know the error rates, you know, false positive and true positive, so they'll know how reliable the test is in that sense. But to know
how probable it is that they're pregnant, well, they need to know a lot about maybe their own behavior in recent days and so on, which, of course, the designers of the experiment can't know, and don't need to know in order to know the error rates. Right. So just to put an example on this. Suppose there is a pregnancy test where the likelihood ratio is very high, so, you know, it is claimed that a positive result is strong evidence
that you're pregnant. But if I took a pregnancy test of that form, I am biologically incapable of becoming pregnant. I know that with pretty high probability. So if I happened to get a positive result, I would not conclude that my probability of being pregnant is high, because my prior is so low.
Exactly. Exactly. In fact, it might even be zero, you know, depending on the case, but it'll be very close to zero. And that's exactly the distinction that I want to make. This distinction between that Bayes factor,
how reliable the test is, which is just the ratio really of those two error rates, that could be really high. But all that tells you is what to multiply the prior by to get the posterior, basically. It's like a multiplier. So if you start off low but not that low, and then you get a really reliable test, well, maybe it's a multiplier by a factor of a thousand, well, then you're going to have a reasonably high probability. But if you start really, really low, then even if you have a pretty high
¶ The Base Rate Fallacy
a multiplicative Bayes factor, still you're going to end up low. And people are very bad at making these inferences. This is something that Kahneman and Tversky discovered back in the 80s. They called it the base rate fallacy. When people are given an example like this, where, okay, you have a reliable test for a rare disease, they're told the disease is rare, like
one in a thousand, and then they're given pretty good error rates, and then they're asked, how probable is it that the person has the disease? And often people give a very high number. And in fact, interestingly, the numbers tend to cluster around basically the Bayes factor, if you normalize it to a zero-to-one scale. And I don't think this is a coincidence. I think what's happening here is you have two factors. There are two things that are relevant here. There's how probable
it is that you have the disease, the probability of the disease. And then there's the confirmation. There's how much the evidence confirms. And that's just, how much does it change, how much does it raise the probability? And I think in these cases, what you have is low probability but high confirmation. That can be very confusing. Right. Because both of these things are relevant to, quote, unquote, how strong the argument is,
that is, how strong the evidence is as a reason to believe that the disease is present. But they go in different directions. So it can be very confusing. And then there's still a residual question: well, why would people defer to the confirmation number, right, when they're asked about probability? I think this is not a crazy thing to do at all. As we said, those
error rates are objective and invariant in a really important sense. You can just discover them in laboratories. Just by working with the causal structure of the test and the chemicals you're looking for, you can be pretty confident about those error rates, independently of the prior probability. And so there's something more objective about those numbers. And, you know, there's something really ironic about the Kahneman-Tversky research, because
¶ Science Focuses on Confirmation
If you read their own paper, well, that's a scientific paper. And so what do scientific papers do? Well, they generally design an experiment and then perform an experiment. And the experiment generates evidence.
What do they tell you about the experiment? What do they tell you about how to interpret it? Do they tell you how probable their hypothesis is to be true given the evidence? Of course they don't. Just like the diagnostic test maker can't tell you how probable it is that you have the disease. That relies on this prior information that they don't know. Science is the same way.
When you design an experiment, what you're really doing is trying to get maximum confirmational power out of the experiment. You want it to be as much of a multiplier of that prior probability as you can,
either a multiplier or a divider. If it's evidence against, then okay, it's kind of a divider of how probable it is; it makes the probability smaller. But the point is, it's not probability that you're maximizing. You can't maximize the probability that your hypothesis is true; that depends on the prior, and different scientists are going to have different priors when they look at experiments. So all you can tell people, basically, is
what the likelihood ratio, what that Bayes factor is, of your experiment, including Kahneman and Tversky's own experiment. So there's this real irony. They're implicitly criticizing human beings for being bad at doing a thing that their own paper doesn't require scientists reading the paper to do. In the big picture, I was a little cheeky; I put this idea as: everyone's entitled to their own priors, no one's entitled to their own likelihoods.
Exactly. And I think that's exactly right. And so I think there's something not irrational here about deferring to the likelihood information. After all, that's the objective, that's the invariant information that we can know. And that's how science works, right? Scientific papers, they basically report Bayes factors, or
something about whether the evidence favors one theory over another. They don't tell you how probable it is that one theory is true or the other theory is true. They know that's going to depend on these priors. And they don't know the prior probabilities of their readership; it depends on what their readership knows.
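To make the multiplier picture concrete, here is a minimal sketch in Python of the odds-form Bayesian update being described. The one-in-a-thousand base rate comes from the conversation; the error rates are hypothetical numbers chosen for illustration.

```python
# Odds-form Bayes update: posterior_odds = prior_odds * bayes_factor.
# Hypothetical numbers: a rare disease (1 in 1,000) and a test with a
# 99% true positive rate and a 1% false positive rate.

def posterior_from_prior(prior, true_pos_rate, false_pos_rate):
    """Probability of the hypothesis given a positive result."""
    bayes_factor = true_pos_rate / false_pos_rate     # the "multiplier"
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)    # odds back to probability

prior = 1 / 1000                # base rate of the disease
tpr, fpr = 0.99, 0.01           # what the box tells you
print(tpr / fpr)                               # Bayes factor: 99.0
print(posterior_from_prior(prior, tpr, fpr))   # ~0.09, not ~0.99

# High confirmation (a 99x multiplier) but still a low posterior,
# because the prior was so low: the base rate fallacy in one line.
```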
And so our self-appointed task is to come up with a formal understanding of this idea of confirmation. Like, clearly it's important. I mean, maybe you have your own priors, maybe you disagree or maybe you agree about them, but we should be able to quantify how much the new evidence is confirming our theories. And it's also, like you say, but maybe it's worth emphasizing, weaker than entailment, than the deductive logic we're familiar with from high school.
P, and if P then Q, therefore Q. Like, that sounds solid. That sounds like logic to us. And we want a logic of confirmation. Yes, yes. And we can have one.
¶ Measures of Confirmation Disagree
Basically, those Bayes factors give it to you. One thing that's really interesting about this literature, and this is what I really got interested in when I was in grad school, I wrote my dissertation on this: if you look in the literature on
probability, statistics, Bayesianism, any of that literature, there's lots of measures of this confirmation. There's lots of measures of, say, degree of correlation. So correlation is another word for confirmation; it's just when one thing raises the probability of another. Right.
There's lots of measures of how strong that confirmation is. One thing you could do is just take the posterior probability and subtract off the prior probability. And you could say, well, that's one way of measuring how much of a difference the evidence made to the hypothesis. But there's many ways to do it, because it turns out that you can define correlation in many equivalent ways. So one way is the posterior is greater than the prior. That's one way. But another way is that
the true positive rate is greater than the false positive rate, right? Or greater than one minus the false positive rate. So the probability of the evidence given the hypothesis is greater than the probability of the evidence given the denial of the hypothesis. Yeah. And those are equivalent qualitatively; they're going to be true in the same cases.
But if you define measures based on those inequalities, they're actually different. They don't agree on which thing is better confirmed than which. They actually disagree on orderings of how well-confirmed hypotheses are. And so they can't be measuring the same thing. It's being confirmed or disconfirmed, but they don't agree on how much. Exactly. And so if you want to measure it, which of course we do, we want to know how much,
then you've got to pick one of these many, and there's dozens of measures and they all disagree. This is what I survey in my dissertation. And so you've got to pick one. Now, the good news is that if you're an inductive logician,
¶ Bayes Factor as Unique Measure
which is a certain tradition that I'm a member of, you actually have a criterion that allows you to narrow things down to a unique measure. And it turns out to be the Bayes factor, the same thing that people report on the boxes of the diagnostic tests. And it's a very simple criterion. The criterion is: however we're measuring this confirmation, it should generalize entailment in the following sense. If the evidence
entailed the hypothesis, if it guaranteed that the hypothesis was true, then that should receive a maximal value of confirmation. And if it refuted the hypothesis, entailed that it was false, falsified it, then that should be a minimal value. Just add that as a criterion, and you're basically uniquely down to this Bayes factor. Good.
And so that gives us, if we're in the framework of inductive logic, now we actually do have a unique way of measuring. And it just turns out, and I'm not sure this is a coincidence, but it turns out it's the very same Bayes factor that they tell you when you buy a diagnostic test.
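As a concrete illustration of that disagreement, here is a minimal Python sketch (all probabilities made up for illustration) comparing two standard measures from this literature: the difference measure, posterior minus prior, and the Bayes factor. The two cases are chosen so the measures rank them in opposite orders, and the last line shows the sense in which the Bayes factor treats entailment as maximal.

```python
# Two confirmation measures applied to the same toy cases:
#   difference measure: d(H, E) = P(H|E) - P(H)
#   Bayes factor:       l(H, E) = P(E|H) / P(E|~H)

def posterior(prior, lh, lnh):
    """P(H|E) via Bayes' theorem, from the prior and the two likelihoods."""
    joint_h = prior * lh
    joint_not_h = (1 - prior) * lnh
    return joint_h / (joint_h + joint_not_h)

def difference(prior, lh, lnh):
    return posterior(prior, lh, lnh) - prior

def bayes_factor(lh, lnh):
    return lh / lnh if lnh > 0 else float("inf")

# Case A: very low prior, very discriminating evidence.
# Case B: middling prior, moderately discriminating evidence.
A = dict(prior=0.001, lh=0.99, lnh=0.01)
B = dict(prior=0.5,   lh=0.90, lnh=0.10)

for name, c in [("A", A), ("B", B)]:
    print(name,
          round(difference(c["prior"], c["lh"], c["lnh"]), 3),  # A: 0.089, B: 0.4
          bayes_factor(c["lh"], c["lnh"]))                      # A: 99.0,  B: 9.0

# The difference measure says B is better confirmed; the Bayes factor
# says A is. Different orderings, so they can't be measuring the same thing.

# Entailment criterion: if E entails H, then P(E|~H) = 0 and the
# Bayes factor is maximal (infinite), as the criterion requires.
print(bayes_factor(0.8, 0.0))  # inf
```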
Good. Yeah, I'm now going to look in stores for diagnostic tests that tell me what my priors should be. Right, that's right. It's probability nine-tenths that you're pregnant, no matter who you are. So this might be a tiny little...
¶ Popper and Falsification
aside, but I remember when I was young and taking my first philosophy of science course, when we came to Karl Popper. We were taught that his notion of falsification was supposed to be a better thing to think than the old-fashioned logical positivist notion of confirmation. I know now that we weren't actually told what that...
old-fashioned logical positivist notion of confirmation actually was, or at least it didn't become clear to me. But what is the difference between those two ideas? Yeah, so that's a great question.
So I think Popper was right in a sense. There is an important asymmetry when you think about degrees of confirmation. So let's think about the confirmational impact of something that refutes versus something that doesn't refute. Well, as I just said, our criterion requires refutation to be
the worst. That's the most negatively relevant you can be. And in this sense, this is the kernel of truth in what Popper said: refuting evidence is more powerful than non-refuting evidence. Good. As negative evidence, and that's absolutely true. He's absolutely right about that. That's in fact one of the criteria that we use to get down to a unique measure, the Bayes factor measure. So I think
what he wasn't right about was that all there is is refutation. So Popper had this weird view that there's no such thing as inductive arguments. Here I think he was influenced by Hume. I think he really got hooked on that skeptical argument. And he thought, well, the only arguments that could be compelling must be deductive. So there aren't any inductive arguments.
Well, then everything must be refutation. That was all that would be left. You couldn't have disconfirmation in a weaker sense, because that doesn't exist. As I said, I'm not a skeptic; I'm an anti-skeptic. I think we know a lot of stuff. I think we can make distinctions between refutation and just negative evidence that's not refuting. Now, of course, it's difficult. It's an art to decide on a probability distribution to use to assess these things.
Yes, you do have to do that at the end of the day, or at least have enough constraints on probability so that you can say what the likelihood ratio is, or something like that. I mean, you need some probabilistic information to do that. But I think we can obtain such probabilistic information by doing statistics. So I'm not a skeptic at all. I mean, I don't have a problem. So I kind of don't worry about the skeptical arguments in epistemology at all, including about induction.
¶ Carnap's Mistake Single Function
But let me just say one more thing. I think Popper was also right in some of his criticisms of the logical positivists. So Carnap was probably the best exemplar of someone who tried to develop a logical empiricist inductive logic,
and a lot of what he says in his work is great and useful. But there's one, I think, key mistake that he makes, and that a lot of people have made, and it is this. He thought, and I think many people still think, amazingly, that there must exist a single probability function such that every argument's strength can be measured with that one function.
I think this is wrong. What do you mean by a probability function in that sentence? Yeah. So, there must be some probability distribution over the relevant propositions, okay, right, such that for any argument... they used to call it, well, some people called it the super-baby's probability function or something. There's this one probability function that can assess accurately the strength of any conceivable argument. And I just think this is absurd.
It doesn't exist. There's no such thing. But I do think there's a weaker claim that is true. I want to say that for every argument, there exists a suitable probability function,
such that when you use that probability function to assess the strength of the argument, you get a pretty accurate assessment of how strong the argument is. And so I just want to reverse the quantifiers. This idea that there's one probability function in the sky that works for every argument, no. That's the kind of thing Carnap thought, and he was wrong about that. But I think it's probably true that for every argument, there's some suitable probability distribution that works, that gives you the right assessment of what the evidence
favors or how strong the evidence is. Is Carnap's idea either identical to, or at least related to, an idea that we could find the one true set of priors for all these propositions? Yes, that's right. That's another way of thinking about it. If you're a Bayesian, then, so-called objective Bayesians think that there's one probability function that will rule them all, or something like that. And, of course, that just won't work. I mean, it's very easy to generate counterexamples, and this
is what Carnap did for about 40 years. He kept facing more and more sophisticated counterexamples to whatever specification of the single family of probability distributions. And I just think this is a fool's errand. You don't need to do that. The way I think about science is you have a theory. So this theory is just the probability calculus, with your Bayes factor and your conditional probability. Okay, that's your theory of inductive logic.
And now to apply the theory, you have to construct models of particular arguments in particular contexts. That is an art and a science. It's going to involve a lot of statistics. It's usually going to be empirical. It's going to involve a lot of extra work. It isn't going to be knowable a priori. But why should it be?
Yeah. You know, that was the logical empiricist dream, that it had to be knowable a priori. And so there had to be just this one probability function you could divine a priori to determine all the answers. And I just think, no, that's not how science works. There are uncountably many
probability distributions. Don't tie your hands by not allowing yourself to use ones that science tells you are appropriate. And so now that's going to be an empirical matter of constructing models
of real arguments. And this is going to be hard work, and in many cases it'll be controversial. But this is the same thing that happens when you're constructing models in science. You've got to make all kinds of assumptions, idealizations, approximations, and it's going to be controversial how to do that the right way. Yeah, that's itself part of science, you know. And who said it was going to be easy?
Nobody said it was going to be easy. That's for absolutely sure. I don't think they did. But okay, as someone who lives in Baltimore, home of Edgar Allan Poe and the Baltimore Ravens, I am very fond of...
¶ The Raven Paradox
what we call the paradox of confirmation. Like, as soon as you have this idea that you're going to start confirming things, you get in trouble, and the philosophers come along to tell you it's not going to be so easy either. Yes, there are many paradoxes of confirmation, but I think you're thinking of the raven paradox, Hempel's paradox. Yeah, this is a classic.
So the way this one goes is it involves a specific kind of hypothesis, something like this. All ravens are black. That's a hypothesis we could have, we could formulate. Suppose we hypothesize that all ravens are black.
And if you want to confirm that, the way we usually think we're confirming it is we make a lot of observations, you know. So we observe a whole bunch of positive instances, and we think, by and large, the more positive instances we observe, the better supported this hypothesis is. Okay, but that assumes that even just a single instance would provide some support, maybe just a tiny amount, but it'll raise the probability of the hypothesis a little bit, which is a plausible idea.
The problem is if you accept that principle, that a positive instance provides some support for a universal claim. So, like, the observation of a black raven should support a little bit that all ravens are black. Of course, you need many to do a lot of confirming, but one does something, right? That's how you get started. The problem is if you accept that, and then you accept the following principle, which sounds very plausible:
that if a piece of evidence supports a hypothesis, then it supports anything logically equivalent to that hypothesis. Sure. That seems right. I mean, logical equivalence, that's a really strong form of equivalence. So anything that's evidence for something should be evidence for something logically equivalent. In fact, we would just think they're the same hypothesis. Well, okay.
All ravens are black is logically equivalent to all non-black things are non-ravens. And now what's a positive instance of that hypothesis? Well, it would be the observation of a non-black non-raven. Okay, but now you get the conclusion that observing non-black non-ravens confirms that all ravens are black.
Okay, that doesn't sound good, because it sounds like you can engage in what Nelson Goodman used to call indoor ornithology. You just observe a bunch of white shoes, you know, a bunch of non-black non-ravens, and you're going to get a lot of confirmation for the hypothesis. Well, that's definitely a problem, but
this is where the quantitative theory of confirmation helps. So, yes, let's suppose you get some confirmation, right? But now that leaves open the following question: might it not be the case that the amount of confirmation provided by the observation of a non-black non-raven is much, much less, you know, in the circumstances we think we find ourselves in, than the observation of a black raven?
And in fact, given very plausible assumptions about statistical sampling, or however you're modeling it, the usual statistical models of observing these things, given very plausible assumptions about the world... you know, here's one assumption: there are a lot more non-black things than there are ravens. That seems right. OK. And if you think that's true, and
it's still true even if you suppose that all ravens are black, that is, that wouldn't affect much the relative proportions, then it just follows that you're going to get more support for the hypothesis from the observation of a black raven than from the observation of a non-black non-raven. So this is where the quantitative theory really helps. And statistics gives us that. It gives us a quantitative way to estimate how much of an effect
an observation has. And so given very plausible assumptions, it's just going to be, yeah, you get some evidence, but it's extremely weak compared to the evidence you get from black ravens. And you can make this much more precise, and you can show that, in general, it's just much more informative to, say, sample from the ravens and see if they're black than to sample from the non-black things and see if they're
non-ravens, right? And you can just make this very quantitative using the theory of confirmation, just these Bayes factors, and given very plausible assumptions about what we think the probability distributions look like, it's just going to follow that
the best way to do the experiment is to sample from the ravens and see if they're all black, as opposed to sampling from the non-black objects and seeing if they're not ravens. Well, and for the non-philosophers out there, just to remind them that this notion of confirmation is extremely weak.
¶ Weakness of Confirmation
When you say observing a white shoe confirms that all ravens are black, it's closer to "supports." You even used "supports" a couple of times there as a synonym. It provides a tiny amount of evidence that might be really, really tiny. Yeah, it's just some bump. It just means the probability goes up. But it could go up
a tiny amount. And in fact, this is what we think happens when we sample from the non-black things and see whether they're non-ravens, as opposed to sampling from the ravens and seeing whether they're black. We just think there's a much larger effect there. So although there's some effect... Yeah.
It's not like it totally gives you no information. And by the way, it's plausible that you should get some information, because if you observe a non-black non-raven, then what you've done is you've ruled one object out. You know that there's one object in the universe that can't be a counterexample
to the hypothesis. And so in that sense, yes, you've gotten maybe a tiny bit of support, but it's absolutely minuscule compared to what happens when you sample from the ravens and see if they're all black. Okay, good. So I'm on board the confirmation train here.
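Here is a minimal sketch of how those Bayes factors can come out in a toy model. All the counts are hypothetical, and the modeling assumptions (a fixed census of objects, one specific rival hypothesis) are illustrative choices, not anything from the conversation.

```python
# Toy raven-paradox model. Assume a world of 1,000,000 objects:
# 100 ravens and 200,000 black non-ravens. Compare hypothesis H
# ("all ravens are black") against one rival ~H ("10 ravens are non-black").
# All counts are hypothetical, chosen only to make the asymmetry vivid.

N_OBJECTS = 1_000_000
N_RAVENS = 100
N_BLACK_NONRAVENS = 200_000
N_NONBLACK_RAVENS_IF_NOT_H = 10   # the rival hypothesis

# Experiment 1: sample a random raven and observe its color.
p_black_raven_given_h = 1.0
p_black_raven_given_not_h = (N_RAVENS - N_NONBLACK_RAVENS_IF_NOT_H) / N_RAVENS
bf_raven_sampling = p_black_raven_given_h / p_black_raven_given_not_h
print(bf_raven_sampling)        # ~1.11: modest confirmation of H

# Experiment 2: sample a random non-black object, observe whether it's a raven.
nonblack_nonravens = N_OBJECTS - N_RAVENS - N_BLACK_NONRAVENS
nonblack_if_not_h = nonblack_nonravens + N_NONBLACK_RAVENS_IF_NOT_H
p_nonraven_given_h = 1.0        # under H, no non-black thing is a raven
p_nonraven_given_not_h = nonblack_nonravens / nonblack_if_not_h
bf_nonblack_sampling = p_nonraven_given_h / p_nonraven_given_not_h
print(bf_nonblack_sampling)     # ~1.0000125: confirmation, but minuscule

# Both observations confirm H (Bayes factor > 1), but sampling ravens
# is vastly more informative, exactly as the quantitative story says.
```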
¶ Conditional Probability and Relevance
But you mentioned in passing the idea of a quantitative measure of this confirmation factor. In one of the papers that you wrote, which I actually read some of, you go through different plausible suggestions for what the equation should be, for what that confirmation factor is. And there's something called the received view. What do you call the received view? Do other people also call it the received view? I don't even know.
Well, yeah, I think it is just kind of the conventional wisdom about how to think about strength of arguments. Right, OK. And that relates this confirmation factor to a conditional probability. And I know that some large fraction of your intellectual life has been spent thinking about conditional probabilities. So why don't you tell us what a conditional probability is, and why it might be related to confirmation? Yeah. So one thing you definitely want to know,
just going back to the disease case, one thing you definitely want to know, maybe the most important thing you want to know, is how probable is it that you have the disease conditional on, or given, that you get a positive result? That's called the conditional probability. And the way it works is you do this: you suppose that you get a positive result, and then you ask yourself, given that supposition, supposing the world is that way, how probable is it that I have the disease?
And that's sort of the natural way of thinking about it. And so conditional probabilities are essential to induction. But of course, there's many different conditional probabilities. There's the probability of H given E, that posterior probability. That's really important. But there's also the likelihood, the probability of E given H, that true positive rate. And there's also the probability of E given not H, the false positive rate.
E and H are evidence and hypothesis? Yeah, E and H are evidence and hypothesis. So E, let's say, is a positive test result. H is that you have the disease. And of course, what you want to know is how probable is H given E, right? Supposing E to be true, and then if you learn E...
Then you update. You update and you accept as your new probability the old conditional probability. That's sort of the Bayesian way of doing things. And, yeah, you definitely want to know that, of course. That's a very good thing to know. But knowing that requires you to know not just the true positive rate and the false positive rate of the test, but also the prior, the unconditional probability: the probability prior to the evidence, before learning how the experiment turned out.
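As a concrete illustration of that update rule, here is a minimal Python sketch of Bayes' theorem for the disease case. The true-positive rate, false-positive rate, and prior below are made-up numbers for illustration, not figures from the conversation.

```python
def posterior(prior, true_pos_rate, false_pos_rate):
    """P(H | positive result), via Bayes' theorem."""
    # Total probability of a positive result, from both hypotheses.
    p_positive = prior * true_pos_rate + (1 - prior) * false_pos_rate
    return prior * true_pos_rate / p_positive

# Hypothetical test: 95% true-positive rate, 2% false-positive rate,
# applied to a disease with a 1-in-1000 prior.
print(posterior(0.001, 0.95, 0.02))  # ~0.045: H stays improbable despite the positive
```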
And of course, that's going to vary very greatly from subject to subject, from person to person who's judging the evidence. So conditional probability is super important. And I still want to say that it is one of the features that makes something a strong argument. You definitely want the hypothesis to be more probable than not, at the very least, given the evidence, if you're going to believe it, if you think the evidence is a reason to believe it.
That's part of the story. And that's the conventional view, the received view: if you want to know how strong an argument is, just calculate that posterior probability, the probability of H given E, and that tells you how strong a reason E is for believing H. But that can't be right. It can't be right because, take you or me: if we take a pregnancy test,
look, the likelihoods are still the same. And if we happen to get a positive result, which is of course possible, because physics, these things aren't impossible, we could get a positive result. Yeah. But we don't think that's a good reason to believe that we're pregnant, because we know we're not. So what that means is that there's another dimension to the assessment of the strength of arguments, and that is what we've been calling confirmation.
¶ Two-Dimensional Argument Strength
And basically, I want to say it's just the ratio of those two error rates. It's just the Bayes factor, the likelihood ratio, whatever you want to call it. That's the way we measure that second dimension of confirmation. And so I want to say I'm offering a two-dimensional theory
of argument strength. For an argument to be strong, it's got to be probable. Sure, yeah, it should be more probable. The conclusion should be more probable than not given the premise, or in this case, the hypothesis should be more probable than not given the evidence. But also, the evidence should be relevant.
If the evidence is irrelevant, it's not a reason to believe the hypothesis at all. Right. So if you have an argument where the premise is just irrelevant, where it doesn't affect the probability of the conclusion at all, then I don't want to say that's a strong argument, because it's not a reason to believe the conclusion at all. Okay. And this was something that the classical inductive logicians just ignored, not just Carnap. If you read books on inductive logic
all the way up through Brian Skyrms' book, which is one of the state-of-the-art books from the 2000s, they just give you this one dimension, the probability of the conclusion given the premise. But I just think that can't be the full story, because relevance, confirmation, also matters to whether something should affect your beliefs.
So let me try to rephrase it, because I'm not sure I've wrapped my brain completely around it. The classical story would say: if the probability of the hypothesis given the evidence is very high, then that counts as confirmation. But what if, for example, the probability of the hypothesis
is just very high. What if we're already convinced of it? Then it could be also high given the evidence, but you wouldn't count that as confirmation. Is that the idea? That's right. That's right. In fact, it could even be highly probable given the evidence. But the evidence makes it a little bit less probable. You definitely don't want to say that's a reason to believe. No, if anything, it's a reason to believe that hypothesis is false.
It just so happens that it still has a high probability anyway, given the evidence. But that's probably because it had such a high probability to begin with. Okay. It's not that the evidence is a reason to believe that hypothesis. And so, as logicians, what we want to know is not whether we should believe the conclusion simpliciter.
But we want to know how strong the argument is as a reason to believe the conclusion. And that, I claim, requires both probability and relevance, confirmation. And "simpliciter" is weird philosopher talk for, roughly, full stop, without qualification. Yeah, that's right. That's right. And sure, if the thing is relevant, then all that matters is the probability. But if it's not relevant, then it's not a strong argument, I would say. Good. So...
That sounds perfectly plausible, but of course we're going to want to know how to tell whether something is relevant. Is that just a vibes-based thing, or is there an equation? There's an equation, and it's just the thing they give you when you buy the diagnostic test: the ratio of the two error rates, the two likelihoods, the probability of E given H and the probability of E given not H. You take that ratio, and that's a really good measure from an inductive-logical point of view. It's pretty much the only one that's going to satisfy the desiderata we like. And so that's what I propose: a two-dimensional theory. You can visualize it as a Cartesian space. The x-axis is the conditional probability of the conclusion given the premise. And the y-axis is that likelihood ratio, which measures how much impact the premise has, how relevant the premise is to the conclusion, or the evidence to the hypothesis.
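Here is a small Python sketch of that two-dimensional picture, reusing the hypothetical test numbers from the earlier snippet. It returns the ordered pair, posterior on the x-axis and likelihood ratio on the y-axis, and shows two cases: a base-rate-style mixed case, and a high-prior case where mildly disconfirming evidence leaves the posterior high.

```python
def argument_strength(prior, p_e_given_h, p_e_given_not_h):
    """Return the pair (posterior P(H|E), Bayes factor P(E|H)/P(E|not-H))."""
    post = prior * p_e_given_h / (
        prior * p_e_given_h + (1 - prior) * p_e_given_not_h)
    bayes_factor = p_e_given_h / p_e_given_not_h
    return post, bayes_factor

# Mixed case (base-rate fallacy): rare disease, good test.
print(argument_strength(0.001, 0.95, 0.02))  # (~0.045, 47.5): low probability, high relevance
# High prior, mildly disconfirming evidence: the posterior stays high, but the
# Bayes factor is below 1, so E is actually a weak reason to doubt H.
print(argument_strength(0.99, 0.40, 0.80))   # (~0.980, 0.5)
```

Neither number alone settles how strong the argument is, which is exactly the two-number point that comes next.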
So there's no one number at the end of the day. It's not like you add those two together or you add their squares together or whatever. It's just you've got to give me both numbers. Yes. And I think this is a really fundamental thing that's so important to emphasize.
¶ Mixed Cases and Ambiguity
I think one of the deepest mistakes made in the history of inductive logic was that they thought there'd be a single measure on which you could totally order all arguments in terms of their strength: a single function that takes a premise, a conclusion, and a probability distribution, and gives you a single number. I don't think this can be done. I think what it gives you is an ordered pair: a probability and a Bayes factor. Good.
And that's all, I think, that can be said in general. Now, of course, you can say something; there's some partial ordering, because if you move up both in terms of probability and relevance, well, then you've gotten stronger in both dimensions. But these mixed cases, this is the problem: cases where you have low probability but high confirmation, like the base rate fallacy.
Or cases like the conjunction fallacy, which also involve relevance and confirmation going one way but probability going the other. And these mixed cases, it's no surprise, I think, are the ones that led to the Nobel Prize concerning how, quote unquote, bad people are at probabilistic reasoning. I think it's because the cases are mixed that people get confused. If you ask someone how strong an argument is, well, if that has two dimensions to it and one of them is high and the other is low, it's ambiguous. The question's ambiguous. And so you might not blame them so much if they're a little confused about those arguments where you have high relevance and low probability or, you know,
high probability and low relevance. Those are hard cases for most people to assess, because they realize both factors are relevant, and what they're being asked for is a single summary assessment, but maybe there isn't one. Maybe it's ambiguous: strong in one sense but not in the other. So I think, in general, there are just two dimensions. I don't think there's a total ordering, a single number you get for any argument and any probability distribution.
There are going to be two numbers, I think, in general. And I think that was one of the mistakes. Has everyone basically agreed with your impeccable logic here? Well, some people have. In psychology, I had the pleasure of working with some psychologists on these quote-unquote reasoning fallacies. And yes, there's a lot of experimental evidence now that
it's the mixed cases that are hard, and they're hard because they're mixed. And so if you fiddle with the confirmation, that is, the relevance, if you fiddle with that y dimension, it's really going to affect how good people are at making judgments about the x dimension. And I think this is because what people really care about is not just how probable the conclusion is given the premise. They care about how strong this is as a reason to believe the conclusion.
And intuitively, they know that depends not only on the probability, but on whether the evidence is relevant, whether the evidence confirms the hypothesis. So there's a lot of psychological evidence now that that notion of confirmation really is relevant to explaining what's going on in these cases. So let's go through some of these cases a little more carefully, because I'm sure that people have kind of
¶ The Conjunction Fallacy
vaguely heard of them, but it's always good to be clear. The conjunction fallacy, I think you already mentioned, and it is one of my favorites, because I was not fooled by it when I first saw it, but I saw why I could be fooled by it, so I'm sympathetic. Yeah, yeah. That's a great one.
The way that one works is you're given some evidence about a woman named Linda. You're basically told that she went to Berkeley in the late 60s, she participated in anti-nuclear demonstrations, she was very active politically, and so on and so forth. She was, like, a flower child. And that's the evidence you're given. And now, this is years later, you're asked: okay, I have two hypotheses I'm going to give you about Linda nowadays.
Either she's a bank teller or she's a feminist bank teller. And you're asked, which is more probable given the evidence? And back in the day, a lot of people said feminist bank teller was more probable, given that evidence. Of course, that's impossible, because feminist bank teller entails bank teller.
Every possible world in which she's a feminist bank teller is a world in which she's a bank teller. And since probability is just a measure of how big a class of possible worlds is, it couldn't possibly be that the conjunction is more probable than one of its conjuncts. That would
violate basic logical and probabilistic principles, so that can't happen. So what's going on? Well, what we showed in a paper that we wrote, and there's been a lot of research on this since then, is that if two very simple assumptions hold, which I'm going to give you in a second, then it's just guaranteed that while, yes, the bank teller hypothesis is going to be more probable than the feminist bank teller hypothesis,
the evidence will actually confirm the feminist bank teller hypothesis more strongly. It'll be more relevant to that conjunction than it is to the first conjunct. And here are the assumptions; they're very weak. First assumption: the evidence isn't positively relevant to whether she's a bank teller. That seems plausible. Okay. Second assumption: suppose she is a bank teller.
The evidence I gave you is still positively relevant, to some degree, to her being a feminist. Maybe only a tiny amount, but still somewhat relevant. Those conditions entail that, for any way of measuring confirmation, for any of the measures, the evidence will confirm the conjunction more strongly than it confirms the conjunct. And so these are mixed cases: probability goes one way, bank teller is more probable, but bank teller is less relevant, less well confirmed by the evidence. And I think it's, again, no surprise that, just like the rare-disease diagnostic cases, the base rate fallacy cases we already discussed, these cases involve one dimension of assessment, probability, going one way, and the other dimension of the strength of the argument, the confirmation or relevance, going the other way.
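To see those two weak assumptions in action, here is a Python sketch with made-up probabilities for the Linda case, where B is bank teller, F is feminist, and E is the description. It uses the simple ratio measure of confirmation, P(H|E)/P(H), purely for illustration; the claim in the conversation is that the result holds for any of the standard measures.

```python
# Hypothetical probabilities satisfying the two weak assumptions:
#  (1) E is not positively relevant to B:       P(B|E) <= P(B)
#  (2) given B, E is positively relevant to F:  P(F|B,E) > P(F|B)
p_B    = 0.05   # prior: Linda is a bank teller
p_B_E  = 0.05   # same after the evidence (E irrelevant to B)
p_F_B  = 0.20   # feminist, given bank teller
p_F_BE = 0.60   # feminist, given bank teller AND the evidence

# Probabilities of the conjunction, with and without the evidence.
p_FB   = p_F_B  * p_B     # P(F & B)
p_FB_E = p_F_BE * p_B_E   # P(F & B | E)

# Ratio measure of confirmation: c(H, E) = P(H|E) / P(H).
conf_B  = p_B_E / p_B     # 1.0 -> no confirmation of "bank teller"
conf_FB = p_FB_E / p_FB   # 3.0 -> real confirmation of the conjunction

print(f"P(B|E) = {p_B_E:.3f}  vs  P(F&B|E) = {p_FB_E:.3f}")   # B is MORE probable
print(f"c(B,E) = {conf_B:.1f}    vs  c(F&B,E) = {conf_FB:.1f}")  # F&B is BETTER confirmed
```

Probability and confirmation pull in opposite directions, which is the mixed-case structure at work.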
And I'm not at all surprised that people defer to relevance. It makes sense. We already saw relevance is in many ways more objective; it's more invariant. It's sort of the language of science, the way science understands evidence: it usually thinks in terms of how much the evidence confirms, not how probable the hypothesis is, which depends on all these idiosyncrasies about prior probabilities.
I'm not at all surprised that people do any of these things. So I say it kind of makes sense that when the confirmation goes one way and probability goes the other, deferring to the confirmation makes sense, since there are many ways in which confirmation is just more important, more informative, more objective than probability is. So I have a slightly different, or I had, for a while after hearing about the
experimental results, a slightly different hypothesis about what was going on, but I'm not sure if it really is different. So let me explain it to you, and you tell me if it's different. I'm wondering whether, when people hear the evidence, which in this case is that Linda went to Berkeley, she was a flower child, she was an activist, and then they're given the two hypotheses, she's a bank teller or she's a feminist bank teller,
implicitly, they assume that being a bank teller means that you're a typical bank teller. And being a feminist bank teller assumes that you're a typical feminist bank teller. And the typical bank teller is not feminist. So there's some sort of interference or tension between the hypothesis that she's a bank teller and the evidence that she was a flower child. It's still sort of a mistake with the question phrased as it was.
But, I mean, that would be a way of psychologizing why we make the mistake. I'm not sure if it's the same as your way or different. Well, yes. So there have been many proposals for different things that might be going on. One that received a lot of attention early on, which is similar, and maybe you can tell me, I think it's related to what you were saying, is that people were actually hearing the question slightly differently: they were hearing it as feminist bank teller versus non-feminist bank teller. Yeah.
Actually, there's definitive psychological research that that's not what's happening. I can point you to papers on this that are just absolutely stunning, by some of my psychology colleagues. There are experiments where they first teach people how to do deductive inferences. They teach them how to infer conjuncts from conjunctions, all this stuff. And then they have them bet, and still a lot of people bet more on the conjunction.
Even though they know that the thing follows, they've actually gone through the logical exercise of it following logically, right, that one hypothesis entails the other. So this has been controlled for. In my opinion, there's a lot of evidence against this particular hypothesis now. So I find the relevance approach, the confirmation approach, more plausible given all the evidence.
But of course, this is an active area of research. There's some even more recent research trying to refine the notion of relevance to go beyond confirmation and take into account other, pragmatic kinds of relevance as well. I think that's really fascinating research. But there's pretty strong evidence now that this second dimension, as I'm calling it, of argument strength is making a significant difference. There may be many other things making a difference too, but it's pretty clear this one is. I kind of love the intersection of the actual psychology experiments with the philosophical reasoning at the most abstract level. The rubber does hit the road at some point. Oh, absolutely. To me, one of the most interesting areas of research in general is that borderline between the descriptive and the prescriptive.
It's such a difficult area, but it's such an important one, because after all, what we're interested in is evidence for humans. This is another weird thing about logical empiricism: who cares about evidence if it's just some purely formal logical relation between things? How does that actually bear on what we ought to believe?
So that's another problem with the whole kind of logical empiricist way of thinking. It's very disembodied and abstract, and it's just unclear why it would ever have any purchase on humans.
¶ Wason Selection Task
OK, so I think one more example might seal the deal here. And you suggested the four-card problem, which I do remember. I looked it up; your paper is full of all these equations and things, so I just looked it up on Wikipedia to remind myself what it was. I do remember coming across the four-card problem, and that one I did get right, just because I've done probability problems before. But the argument plays out in a slightly different way here, so why don't you tell us what the problem is? Yeah, so there's this famous case, the Wason selection task, it's called, and
the way it works is there are cards, and there are different variants of it. So I'm trying to remind myself of the version that we actually worked on, because I don't want to talk about a version I don't know. I wrote it down, if you want me to give the problem, and then you can explain. Yeah, could you do that? The version that I know from your paper is
that there are these cards, and you know that there is a number on one side of each card and a letter on the other side. You know that. And you're shown four cards. One shows the letter D. Another shows the letter K. I don't know if that's for Daniel Kahneman or not; I don't know where these letters came from. The other two show the number 3 and the number 7.
Okay, so D, K, 3, 7. So obviously you're shown the letter side of two of them and the number side of the other two. And the hypothesis is: all cards that have D on one side have 3 on the other side. And the question is, which cards do you have to flip over to most efficiently test that hypothesis, that if D is on one side, 3 is on the other, given that you've been shown D, K, 3, 7? Yes, yes. So this is a great one.
So we wrote this paper a while back, me and Jim Hawthorne, and it's still my favorite paper I've ever written. It's about this Wason selection task, which people tend to make a certain kind of mistake in, and its relation to the paradox of confirmation, which we already talked about. So you remember, back when we were talking about the paradox of confirmation,
it's a better strategy to sample from the ravens and see whether they're black than it is to sample from the non-black things and check whether they're non-ravens. It's just more confirmationally powerful to sample from the ravens and check whether they're black. This turns out to be an isomorphic problem; this problem is basically the same problem. Okay.
Because, so, what hypothesis are we being asked to test in this case? We've got the four cards: D, K, 3, and 7. And what hypothesis are we being asked to test? If D is on one side, then 3 is on the back. That's right. So all D cards are 3 cards. Yes. Or you could just say all Ds are 3s. Yep. Okay, now, all Ds are 3s has the same structure as all Rs are Bs, all ravens are black. Exactly. And the same kinds of things happen. So what you want to do,
if you think back to the raven case, what did we say? The best strategy is to look at the ravens and check whether they're black. The analogous thing here would be to take the D card and turn it over to see whether there's a 3 on the other side. That is exactly the analogous thing. And the same models will show that that's the most efficient
way to respond to this. And in fact, if you just use some very weak assumptions about probability, and you use this confirmation measure that we were talking about, then you can actually rank the strategies in terms of their confirmational power. And it'll turn out, given very weak assumptions about what's going on, that turning over the D card is the best, then next turning over the 3 card.
Oh, sorry, no, that's what people actually do. Right. So what people actually do, and this is great, because I just actually did it, is they turn over the 3 card, thinking it's the second-best strategy. But that isn't the second-best strategy. Right. Yeah.
They think they're trying to confirm, but that's not the best way to learn. Yes. What you should be doing next is looking for counterexamples, right? So you should turn over the 7 card and see whether there's a D on the other side. Yes, exactly. And this is exactly what we show, because we show that the two cases, the paradox of confirmation and the Wason task, are actually isomorphic. They have basically the same structure.
And you can use the same kinds of probability models to model them. And when you do, you get exactly analogous prescriptions in both cases. The best thing: sample from the ravens, see if they're black. The next best thing: look at the non-black things and see whether any of them is a raven. Look for counterexamples, right? Same thing here. But what people actually do in this Wason test, which is really interesting, is
they reverse the second and third strategies. So they'll say D first, but then they'll say 3. Yeah, they'll say turn over the 3 card, when that's definitely less informative. And here the Popperian intuition really is correct:
you should be trying to refute next; you should be looking at the 7 card. And as I was saying, the kernel of truth in Popper comes out in this paper, because basically you can just show that, after sampling the D card and looking to see whether it's a 3, the next best thing is looking at the 7 and seeing whether it's a D.
That's Popper's intuition, basically. And people aren't Popperian, it turns out, because they think turning over the 3 card is better than turning over the 7 card. But actually, it's very easy to show, using very weak assumptions about probability, that that's wrong. So in a way, this is a Bayesian vindication of Popper. That's one of the things I like about this paper. It tells you the kernel of truth in Popperian falsificationism: in this case, going for the falsification is better. It's the second-best thing, not the third-best thing, which is what people tend to think it is. Yeah, but the thing that people tend to do, they reason, if it can be called that: they think, well, your hypothesis is that if there's a D on one side, there's a 3 on the other. If I flip over the 3 and I see a D on the other side, that will confirm,
give some evidence for this. In the space of all possible cards, that's a more likely thing to see. Yes, and it will confirm, but refutations are always more powerful than non-refutations. Exactly, yeah. That's the Popperian insight, and that's why Popper was correct here. So yes, you're absolutely right, it's a kind of confirmation bias. And in our paper, we actually prove, given very weak modeling assumptions, that the only way to get that ordering is if you come into the experiment with a confirmation bias; that is, already thinking that you're more likely to see confirming instances rather than refuting instances. Exactly, right, good. And you can just prove that. Which is sort of a classic confirmation bias.
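Here is a toy Python model of the card-ranking claim. To be clear, this is not the confirmation measure from the paper with Hawthorne; as a stand-in, it scores each card by the expected absolute shift in P(H) that flipping it would produce, and every parameter below, the prior h, the fraction of D cards d, and the hidden-face rates t and k, is an assumption invented for illustration.

```python
# H: "every D card has a 3 on the back."  Toy generative model for a card:
# the letter side is D with prob d; a D card's number side is 3 with prob 1
# under H and prob t under not-H; a K card's number side is 3 with prob k
# under both hypotheses.
h = 0.5   # assumed prior for H
d = 0.5   # assumed fraction of D cards
t = 0.5   # assumed P(3 on back | D card) if H is false
k = 0.3   # assumed P(3 on back | K card) under both hypotheses

def expected_shift(p_out_h, p_out_noth):
    """Expected |P(H|outcome) - P(H)| for a flip, given outcome probs under H / not-H."""
    total = 0.0
    for ph, pn in zip(p_out_h, p_out_noth):
        p_out = h * ph + (1 - h) * pn   # marginal probability of this outcome
        if p_out == 0:
            continue
        post = h * ph / p_out           # posterior on H after seeing it
        total += p_out * abs(post - h)
    return total

# Hidden-face distributions given each visible face (outcome order in comments):
cards = {
    "D": ([1.0, 0.0], [t, 1 - t]),                 # back is [3, 7]
    "K": ([k, 1 - k], [k, 1 - k]),                 # back is [3, 7]; identical -> irrelevant
    "3": ([d / (d + (1 - d) * k),                  # back is [D, K], by Bayes within H
           (1 - d) * k / (d + (1 - d) * k)],
          [d * t / (d * t + (1 - d) * k),          # ... and within not-H
           (1 - d) * k / (d * t + (1 - d) * k)]),
    "7": ([0.0, 1.0],                              # back is [D, K]; D impossible under H
          [d * (1 - t) / (d * (1 - t) + (1 - d) * (1 - k)),
           (1 - d) * (1 - k) / (d * (1 - t) + (1 - d) * (1 - k))]),
}

for name in sorted(cards, key=lambda c: -expected_shift(*cards[c])):
    print(name, round(expected_shift(*cards[name]), 4))
# D 0.25, 7 0.2083, 3 0.0721, K 0.0
```

With these parameters the ordering matches the prescription, D, then 7, then 3, then K; other parameter choices can reshuffle things, which is presumably why the argument leans on weak qualitative assumptions rather than point values.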
¶ Lessons for Doing Science
It is a bias, and I think that in this case the parameters are sufficiently clean that doing D and then 7 is clearly the right strategy. But the real world of science is complicated, right? We're getting late in the podcast; we can let our hair down and think about less completely rigorous deductions here. I mean, are there lessons for how we should do science? Scientists are constantly arguing about which
experiments are the best ones to do. Obviously it has to do with the probability that your different hypotheses are true, your priors, which of course we don't agree on, but also, I think you would argue, the relevance of that experimental result to changing your beliefs. Absolutely. I think a great way to think about experimental design is that what you're doing is trying to maximize the confirmational power of the evidence generated.
And that's neutral as to whether it's negatively relevant evidence, which it might be, or positively relevant. What you want to do is maximize the confirmational power. That's the framework of this Wason and Hempel paper that we did, where we use a very simple measure of confirmational power: it's basically just the
absolute value of this confirmation measure that we have. And if you just try to maximize that, then you can figure out which strategies do it.
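A minimal sketch of that power measure, assuming the log-likelihood-ratio form of confirmation (the paper's exact measure may differ):

```python
from math import log10

def confirmational_power(p_e_given_h, p_e_given_not_h):
    """|log10 Bayes factor|: how far an outcome moves you, in either direction."""
    return abs(log10(p_e_given_h / p_e_given_not_h))

print(confirmational_power(0.95, 0.02))  # ~1.68: strongly favors H
print(confirmational_power(0.02, 0.95))  # ~1.68: equally powerful against H
```

Refuting and supporting outcomes score equally here, which is why seeking potential falsifiers can be the power-maximizing move.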
Now, to your broader question, speaking a little more philosophically and zooming out a bit: as I said, I think this is a very elegant theory of inductive logic. But when you're actually applying a theory, you have to construct models, and this is where all the hard work comes in. You've got to come up with, not necessarily an exact numerical probability distribution over everything, but enough constraints on your probabilities to be able to decide whether the evidence in your experiment favors one hypothesis over another, and which direction the favoring goes in. And how do you do that?
It's going to be quite difficult in many cases. It's going to involve a lot of science, measurement, statistics, and a lot of theoretical argument too. It's partly an art; modeling is not a pure science. But this is true in all areas of science. So what I want to say is that inductive logic is no different from any other science. It gives you a theory, but in order to apply that theory, you have to construct models. And there you've really got to get in the trenches and do a lot of difficult work: a lot of statistics, a lot of measurement, a lot of idealization, whatever is suited to assessing that argument. And it's going to be case by case.
In each context, we have to do the best we can to come up with the most plausible constraints to tell us what the evidence favors. That's all we can do. So I really do think it is a case-by-case business of constructing models and doing the best we can, just like the rest of science. People often say all models are false, which I agree with.
But that doesn't mean the theories are false. So, you know, when you take general relativity and you try to model actual situations with it, well, what do you do? Well, you have to make all kinds of approximations because you can't solve the equations. And then you've got to make all kinds of auxiliary assumptions.
and all kinds of measurements you've got to do, and you've got to do all kinds of statistics there to figure all these things out and get parameters right and all that. Okay, those models, of course, are false because they all involve idealization and approximation and so on. But the theory might be true.
It certainly could at least be a really good framework for constructing models. And this is how I think of the framework I'm offering for inductive logic, with its two dimensions of assessment. In order to apply it, you've got to fit in some adjustable parameters. You've got to tell me what the premises are and what the conclusion is, and then you've got to tell me enough about the probabilities over those things that I can get a judgment as to whether the evidence favors the conclusion or not. Is the evidence relevant to the conclusion? You may not be able to say how probable the conclusion is, but at least you'd like some assessment of how relevant the evidence is. Yeah, I guess I'm trying in real time here, and not quite succeeding, to put this in very down-to-earth terms. My favorite example
of a non-frequentist probability is: is the dark matter a weakly interacting massive particle, a WIMP, or is it an axion? That's another candidate for the dark matter. Or is it something else, a third category?
You know, something we haven't thought of before. So obviously this is not a frequentist kind of question, right? This is something where we have some priors that we're going to update. But now what I'm presuming is that your way of thinking about this would help me answer the following question. If I had a certain amount of money to build an experiment, and one experiment would confirm, like, detect the WIMP,
right, detect that it is that, but the other experiment would tell me that it is not an axion, or something like that. Could I somehow, and I'm truly not able to answer the question in real time, but could I somehow judge which is more useful, depending on what my priors were for those different hypotheses? Yeah, I think you could. You may not even need your priors. What you're going to need are the likelihoods:
how probable is it that we would have observed this evidence given the one hypothesis versus given the other? You're going to have to be able to compare those likelihoods, at the very least. That will give you information about the relevance dimension: does the evidence favor one over the other? It may not tell you the probabilities, because for that you're going to need priors. But still, it can tell you
something valuable: that the experiment is giving you evidence that favors one of those hypotheses over the other, because it's more relevant to one than to the other. Even if you don't know how probable they are, that's fine. You may not know whether to accept
or reject, but you can still say, hey, this evidence seems to favor the one hypothesis over the other. And I think that's generally how scientific experiments actually work. As I was saying before, when you're designing an experiment, you can't determine in advance how probable things will come out; that depends on priors. But what you generally can do is design the experiment so that it provides evidence that favors one thing over another, evidence that's relevant to the experimental question.
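For the WIMP question, the comparison might look like the following Python sketch. Every likelihood here is an invented placeholder, not a real experimental number; what matters is the shape of the calculation: weight each possible outcome by its probability and score each design by its expected confirmational power.

```python
from math import log10

def expected_power(prior_h, p_detect_h, p_detect_noth):
    """Expected |log10 Bayes factor| for a two-outcome (detect / no-detect) design."""
    total = 0.0
    for ph, pn in [(p_detect_h, p_detect_noth),           # detection
                   (1 - p_detect_h, 1 - p_detect_noth)]:  # null result
        p_outcome = prior_h * ph + (1 - prior_h) * pn
        total += p_outcome * abs(log10(ph / pn))
    return total

prior_wimp = 0.3  # hypothetical prior that dark matter is a WIMP
# Design A: direct-detection style, sensitive if the WIMP hypothesis is true.
print(expected_power(prior_wimp, p_detect_h=0.50, p_detect_noth=0.01))  # ~0.52
# Design B: a much less discriminating experiment, for comparison.
print(expected_power(prior_wimp, p_detect_h=0.10, p_detect_noth=0.05))  # ~0.04
```

On these invented numbers the first design wins decisively; with real instruments, the likelihoods, and hence the verdict, would come from the physics.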
There is a claim out there that I'm a little sympathetic to, that scientists should be more open about what their priors actually are. Like, when we turned on the Large Hadron Collider, scientists said, well, we could find all these new particles.
Tell me what the probability is that I will actually find these new particles, which physicists, at least, never, ever do. I don't know if people in other fields actually do that. Do you think it would be good if they put their money where their mouth is in that way?
Well, I've been thinking a lot about different sciences, because I'm working on a project with a couple of colleagues on the replication crisis in science. And the sciences are radically different in terms of how they're dealing with replication and what problems they have. Particle physics is sort of the gold standard. The experiments they do, the evidence they generate, is so confirmationally powerful
that it almost doesn't even matter what your priors are. It really doesn't; the evidence basically just swamps them completely. The likelihood ratios you get from those experiments are so large that you can come in with whatever prior you want, and you're going to come out pretty sure that these particles exist, if you're paying attention to the evidence. So particle physics is really a great example of designing experiments that are so confirmationally powerful that it almost doesn't matter what your priors are.
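Here is that swamping effect in a few lines of Python. The Bayes factor of ten million is a made-up stand-in for a decisive particle-physics result; the point is just that wildly different priors converge once the likelihood ratio is that large.

```python
def posterior_from_odds(prior, bayes_factor):
    """Posterior odds = prior odds * Bayes factor; convert back to a probability."""
    odds = (prior / (1 - prior)) * bayes_factor
    return odds / (1 + odds)

bf = 1e7  # hypothetical, very large likelihood ratio from the experiment
for prior in (0.5, 1e-3, 1e-6):  # enthusiast, agnostic, hardened skeptic
    print(f"prior {prior:g} -> posterior {posterior_from_odds(prior, bf):.7f}")
# prior 0.5   -> 0.9999999
# prior 0.001 -> 0.9999001
# prior 1e-06 -> 0.9090917
```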
Other sciences are not like that. In other sciences, what attitude you come out with after looking at the experiment is much more sensitive to your priors, and it's even more controversial whether you even have relevant evidence. Even that is controversial in a lot of the special sciences. Whereas in particle physics, you know the evidence is extremely relevant. So I view that as one of the easy cases. And, like any theory,
it's going to have cases it's really good at explaining, and it's going to have anomalous cases. And that goes for the Bayesian theory of inductive logic that I'm offering. It's a pluralist Bayesianism: it's not saying you should use one particular probability function, just some probability function, right? It's Bayesian in the sense that I'm willing to put probabilities over all the hypotheses, which non-Bayesians aren't willing to do. But in any event...
Look, it's a theory, and it's going to have limitations, just like Newton's theory wasn't able to explain, in any really plausible way, the motion of Mercury. I'm sure there are going to be cases in science where the theory I'm offering is really challenged to come up with plausible models that explain how much confirmation there is. But that's the nature of science. So I like to think there's a spectrum of cases. There are easy cases, like particle physics or games of chance. And then you go down the spectrum and there are much harder, much more controversial cases. But that's true of pretty much any science.
Okay, well, I think that we have confirmed that this is a fun thing to talk about. Or maybe we haven't, because my prior was so big that the evidence we collected here wasn't actually relevant. But in any event, Branden Fitelson, thanks very much for appearing on the Mindscape podcast. Thank you so much, Sean. What a pleasure.