¶ Intro / Opening
Apply now for the Morris Fishbein Fellowship in Medical Editing. This is a unique one-year fellowship offered by JAMA to introduce physicians to all facets of editing and publishing a major medical journal. The application deadline is January 5th, 2026. For more information, including how to apply, ...
I'm Yulin Hswen, Associate Editor of JAMA and JAMA Plus AI, and you're listening to JAMA AI Conversations.
¶ Eric Horvitz's AI Origin Story
My guest today is Dr. Eric Horvitz, Chief Scientific Officer at Microsoft, where he leads initiatives at the intersection of science, technology, and society, with a focus on artificial intelligence, biosciences, and healthcare. His research has advanced the field of AI through innovations in perception, reasoning, and decision-making under uncertainty. And he has spearheaded numerous efforts applying AI to medical and health domains. Thank you so much for being here, Eric.
Oh, it's great to be here with you. Thanks for having me.
Well, you have such a long history. You've been in this field since a time when nobody knew what AI was. So can you tell us how you got into this field? What drew you to artificial intelligence?
Yeah, it's interesting that what drew me to the field is why I'm here today, and my goals and motivations and curiosities haven't changed. I came to Stanford Medical School having been greatly influenced by my time in the neurobiology lab during my undergraduate years, where I was becoming quite an expert at single-neuron studies. I was amazed at the traces I was seeing. And my curiosity was: how do these simple sparks of action potentials, and the synaptic chemistries that are guided by them, come together to create these fluid inner experiences we have as humans, our intellect, our being situated in the world?
I couldn't explain it with anything that I knew from physics, chemistry, biochemistry, mathematics. So this was and continues to be the biggest scientific mystery of all, from my perspective. And so I came to medicine and the doctoral program in neurosciences at Stanford with a sense that the path would be neuroscience. But through my readings of Herb Simon and other great scholars in AI, the founders of the field, I was already on my way to shifting my perspective: the approach that would lead to answers to my questions would be pursuing what I would call the computational foundations of intelligence. These would be foundational to any intelligence, whether biological or engineered. And, of course, these would probably be the same thing at some point. Biology simply discovered some things about how to do this.
So I found myself bicycling over to the main campus from the medical school, and I eventually became a graduate student in AI. That became more of my central focus in life and my pursuits than even medicine at the time. I think I went through the first two years of medical school while sitting in graduate classes in AI, getting more and more excited that this is what I wanted to do. But I still saw medicine as a great application area. And I also had in the back of my mind: boy, if I have an MD too, someday I can come back, maybe when neurobiology matures and comes to join AI, and that will maybe be the answer to my deepest curiosities.
It turns out that, as a graduate student in AI, I ended up pursuing what was countercultural at the time in the world of logic-based AI and expert systems, if you recall, in the late 80s. Expert systems were the rage in medicine, typically largely rule-based chaining, chaining of logic. And I took a statistical and decision-theoretic approach, Bayesian. That was, again, a little bit off the mainstream, with some tension with advisors and others who didn't think this was the right way to go. But now these are the foundations of modern AI breakthroughs.
¶ Modern AI Breakthroughs And Mysteries
And so, I mean, you were so passionate about your first kind of experience, in undergrad, and your ventures into it. Is there anything that excites you now? Have you felt that passion recently, where you've gone, oh my goodness, I need to jump into this right away, this is it, this is the next phase?
My "right away" seems to have always been, like, now, over the years. I've always been excited about developments. I mean, the first Bayesian reasoning systems we built were soon running at the level of experts in medicine. My first area for deep dives was pathology, it turns out. And I was just amazed at how well these models worked. Now, in those days, I was really excited about the pieces, because with what were called Bayesian network models, we knew exactly how every part fit together and how the algorithms worked.
What's amazing today: I say that we're now past two inflections in AI. There was the neural net inflection, which happened around 2010, when we realized that these methods, though pretty old, were simply famished for data all these years. We gave it to them, and they really delivered in terms of the original powers that we saw, with the ability to do perceptual work that was really quite leading edge, like interpreting radiographs or dermatologic images. But in the last five years, I would say my heart rate has, I might explain, been going a little bit faster on average because of developments that are getting closer to what I would say are the mysteries of mind. Now, I can't make any claims about how they relate to the mysteries of human minds.
But we're seeing some really interesting phenomena now that can't be explained, and that in some ways are as mysterious as the questions we ask about mammalian nervous systems, with these large-scale generative models, in some ways built on a relatively simple set of ideas. You know, the recent clever ideas were one called self-supervision, and the other one is models of how we handle attention and teach these models what to look at when they're thinking. But we're seeing these behaviors, which in some ways are truly showing sparks of more general intelligence, with the abilities to abstract, generalize, and compose. They're polymathic in some ways. I mean, in my first engagement with GPT-4, and we were one of the earliest teams to look at GPT-4, I was just amazed with the ability of these systems to reason across tasks and disciplines with ease. They could write code and poetry, prove theorems with Shakespearean rhyming. They could solve hard mathematical challenges, do chemistry, solve the hardest problems in computer science. These are what we call combinatorial optimization problems, like traveling salesman. But beyond all this general power, they're also showing the ability to specialize, to become expert clinicians, to do diagnosis, to build differentials, to identify the best next test to perform, to discriminate among diseases on the differential.
And I have to say that the mysteries of the current models, where we are today at the frontier, and the prospect that there's some relationship with the questions that I've been asking, and that we've been asking as a community for decades, about mind, make it a very, very exciting time. And we're all pushing hard right now to understand these models.
And I have to say, let me just wrap it back to my action potential and neuron comment earlier. Here we are. If you had told me, during my days at Stanford grad school, that in 2025 we'd be looking at neurons in layers of neural networks with methods that were quite similar to what I was doing with my single-neuron probes back in the 80s, I would have just said, come on, no way. The fact is that we're actually trying to say, what's going on? It's called mechanistic interpretability now. We're actually looking at the values and the changes of activations of neurons now with kind of neurobiological methodologies. So it's such an exciting time.
So...
¶ Addressing AI Fears And Aspirations
You said a lot about AI thinking for itself. So how do you talk to people who are really scared of AI? And where does that kind of lead scientific discovery and the scientific method? What does that mean, if we're having these AI tools be able to come up with these formulations and discoveries? Where does that lead for authorship? And you also talked about the fact that we don't even know what's happening. So where do replicability and reproducibility come into play?
That was a lot. Let me just start with the fears and the aspirations. I think with any new technology over the last 150 years, from the steam engine to electricity, there have been fears about what it might mean.
I took a picture, it's a really fabulous picture, of a sign that was in the Computer History Museum in Montana. It had been hanging over the doors in hotel rooms, and it said not to worry about these electric lights: they would not affect health in any way, and don't try to light them with a match. It was like a cute sign. And I said, wow, people were fearing electricity and electric lights as being very odd and maybe having an influence on health and wellness. But with AI, there are additional factors, especially when it comes to what we see as, as I said, the sparks of more general intelligence, which could apply and have influence in multiple ways, from jobs in the economy to education to human agency and self-dignity, in that our intellect is most defining of who we are as Homo sapiens. And now we see systems that can do things that, in the past, were only in the realm of human decision-making and human cognition, for things that could be achieved on the planet. So these things could be frightening to people.
On the other hand, there's also the prospect that these systems can help us solve very, very difficult challenges to humanity, like climate and sustainability, giving us new chemistries and catalysts to more cheaply remove carbon from the atmosphere, for example, among other directions. I think the biggest and most exciting applications of AI technologies will be in the biosciences and in medicine. One of my hopes is that we will see AI called out as being responsible for helping us to transform many cancers into chronic diseases in our lifetimes, to understand autoimmunity, and so on. I think that the ability to not just understand, but to generate new kinds of therapeutics, with protein design tools, for example, is promising. So I like to ask the question of experts and the public more generally: how can we leverage what we see as the powers of these AI technologies for human flourishing?
¶ Navigating AI's Risks And Rough Edges
What will it take? At the same time, we have to keep our eye on rough edges, and there are a number of rough edges, even beyond what I said about the influence on education and self-identity and so on for humans. These include some near-term hard problems we're seeing. For example, when we see AI systems start generating content that makes it difficult to discriminate fiction from reality, there are big implications for our understandings of the world, for democracy. We risk entering a post-epistemic world, I like to say sometimes, with these technologies, with their ability to synthesize, with high fidelity, alternate realities or the persuasive messages that some humans want to communicate to others as part of disinformation campaigns. That's one area. Another area that I'm concerned about is AI and biosecurity. We just had a paper come out in Science on this topic that probed the challenge and worked on mitigation. We just have to stay on that in the future.
There are several really interesting challenges when it comes to the long term, of independence in our thinking, and agency, and our clarity of thinking. My sister teaches literature at Asheville. She's a professor there. And at Thanksgiving, she showed up at my home with her hands on her hips, saying, what are you doing to my students? They're not thinking and writing with their own minds anymore. They're just pulling down a tab and asking for an essay. What does this mean for the preparedness of our kids and grandkids and great-grandkids for problem-solving and independent thinking?
To continue on this path, there are issues I do think about, again, this idea of thinking for ourselves. Call it cognitive complacency. I just don't know whether or not we are really, again, thinking for ourselves, because it's very different even from when you used to use a search engine, where you would have to input information and research it. But now people are just asking AI questions, all the way from what should I eat today, to what do I wear, to what should I do to solve this problem? Yes, but it can also be dangerous, in a way where, again, you're on this type of autopilot. And so, bringing that up, I know you've talked about aviation safety. How can we have these types of safeguards, where we ensure that AI doesn't produce false information, doesn't take over our minds? Do you have any thoughts about that?
¶ Ensuring AI Safety And Reliability
For uses of AI systems in high-stakes areas like medicine, with the modern tools right now, the AI technologies that people are focusing on and are excited about, we don't yet have strong methods for calibrating confidence in the output of these systems. Calibration, well-calibrated probabilities, specificities and sensitivities, areas under the curve in our studies, for example, were part and parcel of the Bayesian paradigm of AI. You know, when you got a recommendation or a diagnosis, you could see likelihoods and uncertainties and have well-characterized competence, so we could establish a level of trust in the output of these systems. We're struggling today with the current systems: as powerful as they are, they're hard to characterize.
It's funny, because when I give talks, and I gave a grand rounds recently, I talk about the thread that came out of the statistical paradigm, leading into the Bayesian network paradigm and beyond, kind of what's called traditional machine learning, where we have fairly well-characterized tools and methods. All of a sudden, entering the scene and grabbing all the attention and excitement is this glob of blue-green gas in our hands, which has all these interesting powers, and it does really well on various kinds of benchmarks, like USMLE Step 1, Step 2, Step 3. Amazing, remarkable. But it's hard to characterize. When I give a talk on these methods, I'll show a medical audience, like a department of medicine, how powerful these tools are, what they can do. And I always like slipping in an example that shows a devastating failure, devastating and hard to recognize without several physician experts saying, well, wait a minute, why did it do that? And then you realize that this could have even killed the patient.
So I think we need to understand these systems. We are relying today largely on empirical studies, randomized controlled trials. And we do have statistical output on how well these systems can work in different settings, whether it be the emergency department, on the floor, the ICU, and so on. And I think we can start to leverage and harness the abilities of these systems, for example, to help us in a collaborative sense with diagnosis, with care management, and so on. There's work to be done on the calibration of the output of these systems.
¶ AI, Scientific Integrity, And Creation
That requires careful study, and then new methods to be developed. In the sciences, and you mentioned scientific integrity, we celebrate scientists who come up with deep insights and pursue them and validate their results and come up with valuable, useful inferences, artifacts, technologies. What does it mean when scientists begin using tools that can generate hypotheses, help deeply with scientific reasoning and planning of experimentation, for example, and even executing experiments, of the form for which we would have given out awards, prizes, commendations to human scientists in the past? Well, how do we give attribution to models and ensure that, when there's a human-AI experience in the sciences, for example, we understand the role of an AI system? On replicability...
I don't know if people have noticed, but sometimes models are changing even when they have the same name. In a recent paper that I co-authored, which appeared in PNAS just about a year and a half ago, entitled Scientific Integrity in an Era of Generative AI, we called out a set of recommendations and needs and caveats about the use of AI tools in the sciences. One of them said, well, you know, we have to call on manufacturers, the creators of models, to provide new kinds of tools that will propagate through and provide end users with provenance about the source of ideas. If there are really interesting ideas in a system that are encoded without reference back to where they came from, human intellect, then it's in some ways a failure of the technology when a scientist uses these systems and believes there's a new idea that he or she has created and should be credited with. Well, there are technologies being discussed right now that would make that kind of thing part of how the models work.
And then, if you use a model, shouldn't that model, with its name and its training and its abilities, be forever available to the scientific community for replication, and for studies that would say, you know, I want to see what this person did with this tool? Imagine if mass spec machines were changing all the time in a poorly characterized way. How could you refer to a mass spec being used in a scientific study? Those were two of several recommendations we made in that paper with regard to responsibilities of model creators and responsibilities of model end users.
I think the term provenance is really important. I mean, that's our whole field. It's kind of like the citation, the reference, right? It's like, where is it from? Who created it? Doing a book cover, for instance, you can't use AI to generate it, because you don't know where the image that it used came from.
So it's not original art, and so you can't reference it. They could take a piece from, you know, Picasso. I still don't believe that it's really generating new content. It's really piecing together content. At least that's how I see it when it's developing art, because I can see where it has kind of gotten its technique or its style, or certain references, from the art that it's creating, which then is not fully original. It's not like we don't take inspiration, but I think inspiration is very different from, again, plagiarism or copying, I guess.
Well, let me say that, look, we are living...
¶ Long-Term Societal And Ethical AI
for better or for worse, in a very transformative time. I think that, looking back at this time 200 years from now, this particular set of inflections, and how we as a society are grappling with the rise of tools of intellect, referred to as the constellation of AI technologies, this time will have a name. I don't know what it'll be called yet, but there are implications for copyright, for human agency, for our self-identity as creators and as leading intellects on the planet, as scientists, as healthcare practitioners and other experts. I think that there'll be a grappling with, for example, back to your comments, what does it mean to create? How does the intention of copyright law in different nations carry through, and how should it be updated in how we consider the source of creative output with the march of these technologies?
What's the gray zone between the composition of ideas and artwork, which I think, as you say, is part of the inspiration of our leading creatives? What does it mean when AI starts doing that and providing these compositions to humans? What are the limits of provenance and attribution? And what should be expected, and what should become part of what we consider appropriate? I think these questions are all on the table right now.
So, okay. So then, if you could tell an AI scientist in 2125, so in a hundred years, what questions they should be answering, or what you think they need to have been doing in the last hundred years, what would you say?
Well, I think that one of the sets of questions would be in the realm that I refer to these days as deep currents. When I say deep currents, I mean the dark matter of the societal impact of these technologies, the social impact that is hard to characterize. I think I mentioned some of these words before. These are the profound, longer-term influences, the psychological, the social, the cultural impacts of AI technologies. And again, I'll say human self-identity, human agency, our potential over-reliance on AI, our psychological dependence on AI as companion systems, as well as other emerging issues associated with what I would say are new forms of intelligent systems in our daily lives. And I think we're going to need interdisciplinary research efforts to both anticipate and address these over time. Many of these are going to be more like frog-in-the-frying-pan situations. They're not immediately apparent or measurable. But over time, I think they're going to be important to look at.
Now, you might know that one of my passion projects was creating the 100-Year Study on AI at Stanford. It's called the 100-Year Study to keep it concrete, but in reality, it's an endowment that, we have a promise from Stanford University, will fund a set of experts to come together and write a report every five years for as long as Stanford exists, asking sets of questions, seeking to guide and to provide recommendations on next steps for AI and where it's going. And I refer people to the framing memo I wrote in 2013 or so, where I called out 18 topics that I think have stood the test of time. And I think they'll still be the topics we care about in 21... what was the year you mentioned?
2125.
2125. Well, put it this way: John Hennessy, the president of Stanford when I set up this endowment and defined this study, which has now done two reports, with a third one coming out very shortly, said, I guarantee you there will be a report. You pick a year like 2200, and there will be a report, should Stanford be around. So it'll say something. And what the 100-year study leaders came up with were a set of standing questions.
The first question is really interesting if you look at the 2021 report. It was, what are some examples of pictures that reflect important progress in AI and its influences? And they put photos in. It's almost like a time capsule of what people were seeing now in the world. And can you imagine if that question is still around? Give us photos of, visually, what's going on in the world with the influence of AI and people.
Another question, which is dear to my heart, from the first few sentences of our interaction today: how much have we progressed in understanding the key mysteries of human intelligence? And a third question that they ask every report: what are the most inspiring, open, grand challenges for AI? Now, in 2125, it would be interesting to get an answer to that question.
It's not like everything will be answered then, and so on. So I refer people to the AI100 reports, ai100.stanford.edu, and to the framing memo, with the 18 topics. I would modify them now a little bit, but not that much. They're still relevant for the, as I said here, long-term questions that we have as funders and as study authors. Let me just say one thing from the point of view of where we are now: we will be surprised.
¶ Concluding Thoughts And Podcast Outro
Okay. Well, I think we will too. I also really liked your points about society. I think that's the biggest change. We didn't see social media coming, and cultural shifts are huge. Obviously, there's traditional scientific discovery, but the social changes that we're going to see, we're going to be surprised. And we just don't know. Well, thank you so much for joining me today, Eric. It's been great chatting.
I am Yulin Hswen, Associate Editor at JAMA and JAMA Plus AI, and I've been speaking with Dr. Eric Horvitz on the role of AI for innovation in perception, reasoning, and decision-making under uncertainty. You can find a link to the article in this episode's description. And for more content like this, please visit our new JAMA Plus AI channel at jamaai.org. To follow this and other JAMA Network podcasts, please visit us online at jamanetworkaudio.com or search for JAMA Network wherever you get your podcasts. This episode was produced by Daniel Moreau at the JAMA Network. Thanks for listening. This content is protected by copyright by the American Medical Association with all rights reserved, including those for text and data mining, AI training, and similar technologies.
