Ep 63: Khan Academy Founder/CEO Salman Khan on Classrooms in 20 Years, Rolling Out to 1.4M Users

⁠¶ Intro

00:00

Salman Khan is the founder and CEO of Khan Academy, one of the most influential education platforms in the world. They serve over 150 million learners across 190 countries. And they've also been doing some really interesting stuff in AI with Khanmigo,

00:11

their AI-powered tutoring assistant. It's already been deployed to over 1.4 million students and teachers. And Salman and I had a really interesting conversation around how AI is transforming education for students, teachers, and school systems. We talked about where these models work, where they don't,

00:26

and what it takes to build AI that actually makes a difference in the classrooms. Some of my favorite parts included why proactive AI is the next evolution of tutoring, some surprising ways students and teachers are using AI, what it takes to reduce hallucinations and math errors to near zero,

00:39

and how AI is reshaping engagement and assessment. This was an awesome conversation with someone who's really doing one of the widest deployments of AI today in the real world for a really good mission. Before we get to the episode, I just want to say a huge thank you to all those that rated the show on Spotify. If you're enjoying this on either of those platforms and haven't yet left a rating, please consider doing it. Now here's Salman Khan.

01:02

Well, thanks so much for coming on the podcast. Really appreciate it.

01:06

Thanks for having me.

01:06

Yeah, this is gonna be a fun one, and I figured I would start with the most overly broad question I could, which I'm sure our listeners are curious about. Given all the rapid advances in these models and what you've been building on top of them, what's your current thinking on what a classroom looks like, you know, twenty years from now?

⁠¶ The Vision for Future Classrooms

01:24

Yeah, it's a good question. You know, I think if we had this conversation five years ago I would have thought that's a boring question. And now I think that's like too far out because things are changing so fast.

01:34

Always hard to find the right time period in the AI world.

01:37

I'll tell you what I'd like to think a classroom looks like in twenty years or even in five years, but okay, let's go with your twenty. I'd like to think that more classrooms are actually going to look like what great classrooms already look like. A great classroom today is one where

01:53

students are not just passively listening to a lecture. They're engaged. They're interacting with each other. They're actually doing things, whether it's problem solving, working in groups, giving presentations. In a great classroom today, a teacher is walking around, working with the students, getting them to do interesting things. Now, the reason why I think AI is potentially going to play a role there is right now those great teachers

02:24

have to spend a lot of time making great lesson plans and being very creative about it. These great teachers are able to do that even though they still have hours of grading and lesson planning and progress report writing.

02:35

These great teachers are the ones that can really observe a classroom and almost intuitively understand what students might need. And those are superpowers. And I think there's a world where artificial intelligence can give more teachers, all teachers, some time back on the planning side of things, giving them better insights

02:59

on how they can better manage their classroom, better insights on where the students are at any point in time, better interactivity with their students. So, you know, when these students are all doing breakouts, and I can imagine in twenty years for sure, it's not necessarily going to be on a laptop. I think one of the powerful things about where generative AI is going is there's no reason why it can't be ambient.

03:26

And it's just observing the classroom and seeing what's going on. And so right now people try to think very binary: okay, technology is you're just staring at a screen, non-technology is you're running around in the real world. I don't think there necessarily has to be a trade-off in twenty years. I think in twenty years you're also going to have a whole other aspect, which sounds like science fiction.

03:48

But I think that's the point at which virtual reality, augmented reality will become very mainstream. And that with generative AI, and superintelligence at that, who knows? You're going to be able to immersively go into simulations, be part of virtual worlds, go back to ancient Rome and try to stop, or maybe help hasten, the assassination of Julius Caesar. Well, whatever it might be, but you know, it's literally like a Magic School Bus ride.

04:16

Yeah. No, I mean, a lot of really compelling threads to pull on there. I think especially as you talk about giving teachers superpowers and augmenting what the best ones are already able to do.

04:25

Probably a good segue just to talk a little bit about, you know, what you've been doing with Khanmigo and the efforts at Khan Academy building AI products. Could you just give a little bit of context for our listeners on what you guys have built to date, how you've been thinking about this? Yeah.

⁠¶ Khan Academy's AI Initiatives

04:38

It's interesting because I think in this time of rapid change it's always important to have, you know, what are we in this for to begin with? Like, what's our true north? And Khan Academy's true north, I've been articulating it

04:50

much more clearly, actually, ever since generative AI became a thing. But if you go back to the early days of Khan Academy, when it was just a hobby, I was tutoring family members, then I started writing software. I started making videos. Obviously Khan Academy became much more than me, but everything we've done over the last

05:05

almost 20 years is trying to approximate or replicate some of that personalization that a good tutor would do. You know, not only make hopefully high-quality materials available, but make it so that it can be personalized to the student, they can get practice. Teachers are an important part of it. How does a teacher with 30 kids in their classroom personalize it more? So we've always been doing that.

05:29

And even in the early days of Khan Academy, I used to cite The Diamond Age, the Neal Stephenson book, and the Young Lady's Illustrated Primer. I was like, we're gonna build that one, or this is what we are building. It's just we're gonna be doing it incrementally as the technology gets better. So when we saw what was possible with the latest generation of, you know, in particular GPT-4, which OpenAI gave us access to months before even ChatGPT existed.

05:53

It had issues, but we said, okay, this is going to be able to approximate tutoring. It's going to be able to approximate a teaching assistant and other things that we probably hadn't conceptualized before. So that's how we launched Khanmigo, which is our AI assistant, as a tutor and a teaching assistant, putting guardrails so that teachers can see what students are up to. It won't cheat. It

06:13

you know, safety, privacy, etc. It's much more Socratic. So we're really trying to lean into the: let's make this really good pedagogy, let's make this safe, but let's also make it useful. I think we've had a lot of successes. When we launched, I thought maybe by 2025 we'd have maybe a hundred thousand folks using it as a pilot. It's now pushing about 1.3, 1.4 million teachers and students. And these are

06:41

You know, our mission is free world-class education for anyone anywhere, but we've had to charge these districts for it because of the compute cost and the support and the training that we've been giving. We charge them about fifteen dollars a year. So this is, you know, to have 1.3, 1.4 million users in districts paying within a year and a half, and the interest continues to be very, very strong. So one thing that we are doing now

⁠¶ Proactive AI and Engagement

07:07

is realizing that the next phase is making the AI much more proactive. Even if I walked into any math classroom, or any classroom, and I said, hey, I'm a great tutor. I'm here in the back of the room. If you have any questions, come ask me.

07:20

There's probably gonna be about ten or fifteen percent of kids who do it. And we're seeing that with the AI. So this next version of Khan Academy you're going to see, we're gonna start piloting it in back to school. We're calling it Khan Academy Classroom. But it's, from the student point of view, a much more proactive AI that every time you go to Khan Academy is like, oh, welcome back, Jacob.

07:39

Hey, it's been a little while. Hey, here's what your teacher wants you to do. Hey, how can I help you here? And from the teacher point of view, same thing. Much more like a concierge, front and center, as opposed to being just something to ask.

07:50

Yeah, it's so interesting 'cause I feel like, you know, across AI products there's this blank screen problem where you get to a place where you can prompt anything, but it's like, how do you actually figure out what the right thing to prompt is?

08:00

How have you thought about, you know, just starting to teach students and teachers as you've rolled out? I think Newark was one of the places where you did a huge rollout. What have you learned, I guess, about teaching folks how to start using these products and getting started from a cold start?

08:16

Yeah, it's a very real thing. And, you know, we did things like little dynamic action bubbles where we're suggesting things that people might want to try next. Obviously there's some training, things like that. But I think the core is making the AI much more proactive about things. The other thing I'll say, because it's important right now, there's probably five hundred people who are

08:41

claiming to make some version of an AI tutor in some way, shape or form. I don't think the AI is quite ready yet to be by itself and drive learning for most people. I think if you're a curious person, you could go to ChatGPT, and if you spend half an hour every night with ChatGPT and you're really good at prompting it and asking it questions, you can learn a ton. But that isn't most people, that isn't most students. And the AIs aren't great yet at creating

09:12

high-quality questions that are not gonna have errors. They're getting better every day, but they're still not where they need to be. So, you know, a lot of what we see in places like Newark is the efficacy, and we really are seeing some amazing efficacy numbers coming out of there, really happens from doing the traditional practice on Khan Academy. And the AI there is a support. It's something to help drive engagement. And

09:36

A lot of people in education and ed tech always think about, oh, can I come up with a more efficacious intervention? And that obviously matters, but it turns out if you engage with anything that seems reasonably healthy, it's probably efficacious. The hard part's the engagement.

09:52

And this is where, you know, we're trying to look at every dimension of that pipeline, all the way from how do you get a teacher activated faster and where's the AI there? How do you get the district's administrators,

10:06

in a polite way, holding the classrooms accountable, like, okay, you're really engaged in this tool we're using, the students are really using the AI? And then of course, how do the district leaders and the teachers hold the students accountable? These types of human systems, and how the AI can help the humans hold other humans accountable, is actually how you get engagement.

10:25

Yeah, it's such an interesting point that engagement is actually really the thing you need to solve for. And on the student side, I'm curious, obviously you've released this now to a ton of people using it.

⁠¶ Teacher and Student Experiences

10:35

Any surprises in the way that it's been used? I mean, I've heard you speak on another podcast about, you know, obviously one of the shortcomings of these models is they aren't always right. And you found that students were explaining their reasoning to the models and then the models were

10:49

iterating on that reasoning, and it was actually a really interesting way for students to learn and for the models to correct themselves on the mistakes they're making. Anything else you've noticed in the way people use this that might have surprised you?

11:00

Well, there's definitely cases, these aren't mainstream, but as I said, if you're a really motivated person and you engage properly with these models, there's some amazing, magical things that can happen. I gave an example in a TED talk two years ago, but it's still a fun story, of this young woman at Khan World School, which is an online charter school we have, and she was in India at the time and she

11:23

was reading The Great Gatsby, and we have an activity where you can talk to AI simulations of literary characters, and she had a lengthy conversation with AI Jay Gatsby. And you know, our simulations, they don't just try to answer questions. They actually try to drive the conversation. So, what have you thought of? Are there things in your life that you wanna, and

11:45

I remember her telling me about that interaction and thinking, okay, this is beautiful. This is the kind of thing that we want to see more of. I've talked to my son a lot, my oldest, who just turned 16, and you know, he's

12:01

doing some pretty advanced math now. He's probably at my level or maybe even a little ahead of me at this point. And he's actually using it all the time to really explore ideas. And I'm like, is it a hundred percent right all the time? He's like, no, but neither are you. And yeah. So, you know, one of the things that we are putting a lot of resources into, and I don't think a lot of other people are, is really trying to not just

12:27

improve the AI's accuracy, but measure how good it is. And as far as we can tell right now, Khanmigo, when it's anchored on Khan Academy content, is at about a two percent error rate. And that two percent error rate's about split evenly: one percent of the time it's a straight math error and one percent of the time it's an evaluation error. So maybe the answer is one third, you put 0.33 and it said, great job. It should have said, well, close.

12:53

Are you sure? You know, do the threes keep going, or can you represent that as a fraction? And we obviously would love to get it to a zero percent error, but I actually think my son is right. Already, when I'm tutoring my own kids now,

13:07

it's probably one out of every ten or fifteen times that I'm like, wait, wait, that's not what I got. Wait, let's do that again. And I'm like, oh yeah, you're right, you're right, you're right. You know, and so I actually think the error rate's already better than a lot of human tutors.
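
To make the 0.33-versus-one-third example concrete, here is a minimal Python sketch of a grader that separates an exact answer from a close decimal approximation. It is an illustration only and not how Khanmigo works: the function name `grade`, the three labels, and the 1/100 tolerance are all assumptions made up for this example.

```python
from fractions import Fraction

def grade(student_input: str, expected: Fraction,
          tolerance: Fraction = Fraction(1, 100)) -> str:
    """Classify an answer as 'correct', 'close', or 'incorrect'.

    Hypothetical helper: the labels and the tolerance are assumptions.
    """
    try:
        # Fraction parses strings such as "1/3", "0.33", or "2/6" exactly.
        answer = Fraction(student_input.strip())
    except (ValueError, ZeroDivisionError):
        return "incorrect"
    if answer == expected:
        return "correct"
    # A near miss (for example a truncated decimal) earns a nudge, not praise.
    if abs(answer - expected) <= tolerance:
        return "close"
    return "incorrect"

one_third = Fraction(1, 3)
print(grade("1/3", one_third))   # exact match
print(grade("0.33", one_third))  # truncated decimal, within tolerance
print(grade("0.5", one_third))   # too far off
```

The point of the middle label is that a tutor can respond to "close" with a follow-up question (do the threes keep going?) instead of saying "great job".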

13:22

What about on the teacher side? How have you seen the best teachers leverage these tools? What's the overall reaction been to some of these things entering the classroom? I mean, obviously, in some senses, through ChatGPT's broad release, they've entered the classroom regardless of whether it's through Khanmigo or not. And so what have you noticed on the ground there?

13:39

Yeah, the ideal, and this is maybe not quite as sexy, I mean there's some sexy use cases from teachers too, but the ideal is they just use it regularly. They create a habit around it, where they're working with the AI, helping tweak their lesson plans, making it a little bit more entertaining, right-sized for the classroom, then delivering it in the classroom. We have a partnership with Blooket, for those, you know,

14:02

Blooket's, I believe, as popular or more popular than Kahoot. It's like in-class gaming based on question sets. So our partnership with Blooket is Khanmigo will generate the Blooket questions for the teacher. So what used to take a teacher maybe half an hour, an hour to write fifteen, twenty questions or more, they can now do in about two minutes.

14:24

And we're seeing a lot of great things there. So we're seeing the planning, the delivery inside of the classroom, and then the getting insights from it. And then that's for another wave of planning. And there's a lot of things that we're building for this coming year that will make that much more streamlined and integrated.

14:38

But a teacher who has that kind of habit, and for me it's always been the case, who's using Khan Academy to make assignments, hold the students accountable, look at the data, and then keep doing that, they seem to get very, very good results. Once again, it's all about engagement. In terms of

14:54

sexier use cases, I mean, we've definitely seen teachers talking about, you know, AI simulations. They're definitely opening up a class and saying, all right, everyone, we're gonna talk to an AI simulation of Harriet Tubman or George Washington. Ask your hardest questions. Which is really engaging. We have another tool called Writing Coach, which is our answer to the fears around cheating, et cetera, where

15:20

the teacher creates the assignment with the AI, assigns it through the AI. Students do it with the AI, but the AI really acts as an ethical writing coach. And then when the student submits, the teacher doesn't just get the final output, they get the process, and they can talk to the AI about what went on, and

15:37

If I copy and paste something from ChatGPT, the AI says, I don't know where this came from. So it actually undermines not just AI cheating, we believe, but all forms of cheating. And teachers are starting to use that reasonably regularly for writing assignments. So that one I'm pretty excited about.

15:53

I love that approach. Rather than, you know, say, hey, we're gonna just ban these tools completely, which is A, unrealistic, and B, not how people are actually gonna interact in the world, it's like, let's find a way to teach people to use them in a way that still develops their own thinking.

16:04

I think it's really clever. Do you imagine that being true across other subjects as well? Obviously in writing it makes sense, given that in the workforce people are going to use these tools going forward. Do you think most of schooling should involve leveraging them to some extent, or are there still places where you should be able to do math and writing completely without them?

16:25

I think it's gotta be both. I definitely think if you're going to be managing an AI to do some of your work, you need to be able to write well yourself. So I think especially if we're pre-high school or even early phases of high school, yes, you should do some more writing inside of the classroom, with the teacher there. It could be a little short-form writing, et cetera. Or maybe you do it over multiple class periods. That I think is healthy.

16:50

At the same time, especially once you get into high school and college, yes, you should have more opportunities to use the tools to get more productive. You know, I have to give a few commencement addresses in a couple of months and I was like, oh, I better work on this. And I was in an airport and all I did is I recorded my thoughts,

17:13

my, like, life advice. And then I had it transcribed by an AI and then I had it turned into a first draft. And that first draft, I mean, it speaks to both things. That saved me a ton of time that I would have not otherwise done. But if I didn't know how to write and if I just took that first draft that came out of my random musings, it's a horrible speech.

17:38

But it had the essence of some of the ideas. Like, I was just looking at it last night and I was like, wow, you know, this one way of phrasing it was actually quite beautiful. And I'm gonna tweak it a little bit 'cause it doesn't sound exactly like me. But going from zero to one, and really it went from point three to like point seven, right? 'Cause I gave it, these are my thoughts.

18:02

But these things are massive accelerants and people should learn how to do this. Yeah, just speaking out loud, I can imagine, when you and I were in school, the term paper, they say you have two weeks. What if a teacher says, you know, your opinion on this matter, but by the end of the class period, I wanna see some output.

18:21

What about on the district level? I feel like in other categories, obviously we're seeing tremendous pressure from boards, from CEOs, like, hey, let's adopt AI as fast as we can. And obviously the schooling system is its own unique universe.

⁠¶ District-Level Adoption and Policy

18:35

Does it feel like there's that same kind of pressure for uptake or momentum? What have you observed at the policy and district level?

18:43

I think generally speaking, this is one of the cases where schools might be one of the first places where you see mainstream adoption of AI for productivity and learning and just, you know, doing the day-to-day work, which is really trying to help kids learn. As you already mentioned, teachers have already leaned into this because so much of what teachers do can be streamlined with AI, especially on the planning side of things and the grading side of things. And

19:14

as the models get better and can support students better and get more proactive, I think everyone sees it. And yes, there is compute cost, et cetera, but it's dramatically cheaper than anything else that's come before. After the pandemic, there was eighty-six billion dollars that was spent on ESSER, these funds to help kids remediate.

19:34

And, you know, that's like two thousand something dollars per American student. And a lot of districts plowed it into fairly expensive paid tutoring, like live tutoring, and there are some exceptions, but for the most part there's not much to be shown for it. So instead of something that was costing twenty-five, fifty dollars an hour, you're now looking at something that costs ten, fifteen dollars a year.

19:59

And you get much more dosage if you want to. So yes, I'm actually seeing more in school districts than I'm seeing as a leader of Khan Academy. And you know, I've been pushing the Khan Academy team. I was like, when are we going to be able to automate some of our bookkeeping? Or when can we do this or this? I'm constantly pushing the engineers on

20:23

how much more productive are you getting with the coding? I heard that Company X is, you know, a hundred percent. Why can't we be a hundred percent? But yeah, there's school districts we've talked with, and they're saying it's saving their teachers at least five hours a week, if not more. They're using it as a recruiting tool, retention tool.

20:42

Yeah, it's really powerful. I mean, I guess from a society perspective, I'm glad that schools have been one of the fast adopters, if any place was gonna be.

20:50

You know, I guess I'm curious, in the process of building Khanmigo, I mean, you talked about obviously you had to add guardrails in, it sounds like you've tied the models to your own content that you already had on Khan Academy. Anything else you needed to build on top of the OpenAI models to make this work in the classroom setting?

21:06

Oh yeah, I mean, I could go down the list. It is surprising, on one level, a lot of these AI applications, and unfortunately I think many of them, are just kind of thin prompting layers on top of a model. But yes, you want to do the safety, you want to do the moderation. Moderation is something that frankly we probably were overly conservative on to begin with.

21:29

Makes sense given your audience.

21:31

Given our audience, but then, you know, there were a lot of false positives. So I think we now have a handle on that one. The math accuracy in particular, this is where there's a lot of work. I mean, just to get the error rate low. And once again, the hardest errors aren't can this thing

21:52

figure out what five to the eighth power is. The hardest errors are evaluation errors with the students, especially when there are certain students who are just hammering it, trying to get an answer. And they keep switching context, et cetera, et cetera. And so how do you

22:05

handle all of that? A lot of work just making the user interface something that feels more natural. Obviously, as I mentioned earlier, we're doing this whole reengineering of the front end of Khan Academy to imagine a more proactive, AI-first approach. And I think when people see that, they're going to see like, okay, this is

22:25

This is not just a chatbot. This is the AI integrated in every aspect of it. The writing coach we've been doing, you know, that's where once again it's not just a chatbot. There's a brainstorming on your thesis statement. There's an outlining tool that the AI can see and has context on. And then when you're drafting, it really is kind of, you know, like you're on a Google Doc and it's highlighting parts of it based on the dimensions.

22:50

So yes, there's a lot to be done that's well above and beyond just a thin prompting layer. You know, I think the world now doesn't want twenty different apps on an AI tool. They want an app that's smart enough to figure it out and, behind the scenes, do a little bit of prompt chaining and swap prompts out. So there's a lot of work there and making sure that that's robust. Memory. I could keep going.

23:19

Yeah. What are the capabilities that are meaningful to you that are still on the come? Like breakthroughs, I guess, either in the core LLMs or some of these multimodal capabilities, that would be game-changing in what you're able to do in the product?

23:32

Yeah, I mean memory is a big one. I know right now you could go to ChatGPT and ask it about, you know, apparently it remembers everything you've done. But for people like us to have access to that so that we can give the models much more context. It has all the memory, but at the same time maybe there's ways to reset some aspects of that memory, know what it's remembering or inferring about you. But I think memory's a big one.

23:58

The advanced voice, which is out there, you can use it already in ChatGPT and Gemini and all of that. I think integrating that with our platform is going to be pretty cool. We are building for a world where over time the Khan Academy written content might become less and less relevant. You know, that breaks my heart as someone who's made seven thousand videos, but it's just a reality. And so

24:27

They're all being used to train the models anyway, right? So they're still they're still in there.

24:30

They've had some utility. For the models to create higher-quality questions, I think, is big. You know, I think there's some very cool capabilities of models, like the ability to make videos and certain types of images and all of that. Those are fun. I don't think the world has come up with really good pedagogical use cases for them just yet. I mean, there could be some fun project-based learning,

24:55

things like that. But yeah, the pedagogical side hasn't been, well, one on image: we do hope, and this is probably about a year and a half out, for students to be able to, especially if they're on a tablet device, show their work and the AI being able to see their work.

25:13

I think you had your son demo that, right, with the

25:15

Yeah.

25:16

Yeah, yeah.

25:16

Yeah. Well, that OpenAI demo from last year, which, you know, in fairness to the demo gods, that was a fifth take. It was making a ton of errors, but it eventually did work. But that idea of the AI very naturally being able to talk while seeing your work

25:35

and giving you feedback, yeah, I think that, if it's robust, feels indistinguishable from when I was tutoring my cousins or now I'm tutoring my kids. So I'm hoping in the next year and a half, two years we can make that a mainstream thing. That'd be a big deal.

25:52

Yeah. One thing I'm struck by, when you alluded to, like, there's fun use cases one could explore, and to your earlier point that so much of this is around engagement, there's a bunch of different techniques and a lot of them are effective, and it's really about just getting people to engage with any of them.

⁠¶ Gamification and Engagement

26:05

I'm curious to see whether we see some interesting, just fun new ways that we haven't conceptualized that actually drive people to engage, and then obviously you can layer some of this other stuff on top of it.

26:16

Yeah, and you know, we've been gamifying a bit over the years, and obviously there's people in edtech who've done very good jobs, people like Duolingo. Yeah, the Duolingos have the benefit that they kinda make up the standards, right? And they can make those dopamine hits

26:33

as much as you want. When we started we were on that same journey, but then as soon as we became more mainstream in schools, we're like, okay, we have to align to the standards. And some of these standards are not the most pleasant things for some students.

26:46

And so there's always going to be, you know, learning isn't always easy. So there is always that productive struggle you have to do. Now I do think if we can be really creative about, like, quest-based learning, and

26:59

When people go on scavenger hunts or escape rooms, they are willing to solve pretty hard, cognitively challenging problems because it's part of a broader game. So I think there might be something there, especially if the AIs get good enough. My ten year old, his last birthday, when he turned ten,

27:15

I made a scavenger hunt for him and I used, you know, the reasoning model to come up with most of it. It didn't do it out of the box. I had to tweak it a lot. But, you know, in a few years you can imagine it being able to create games out of these types of things.

27:30

Totally. Do you have a go-to thing you use to test the new models? I remember, I think famously, when you got shown GPT-4, you were using, I think, the AP Bio exam to figure out whether these models were any good. Like, what do you use now as a new model comes out from, you know, Google or OpenAI or Anthropic to figure out, you know, how good it is?

⁠¶ Evaluating AI Models

27:46

You know, some of the things that we were worried about a year or two ago, I don't worry about anymore. My gut sense is, and, you know, the data seems to back this up, that all of the Geminis, the Claudes, the Groks, and the, you know, GPT-4.5, that they're all in the same

28:07

...plus or minus. Some would say, you know, one's better than another. But even that's kind of unprompted. I think if you prompt, you could get almost identical behavior out of all of them. So now it really is... there's obviously a cost-performance, but the differences between them are subtle enough that I have to lean more on our own evaluation framework.

28:32

To say, okay, out of our tough test cases, how many is this succeeding or failing on, etc., versus just, you know, me playing around with it.

28:40

Yeah. How do you guys do model evaluation today? And, like, any learnings on that since getting started?

28:45

Oh, well, there's a bunch of, um... We have a series of tough test cases that, from the beginning, we saw, and it's now several hundred cases, that we know these models have had trouble with. And, you know, some of these are the classic: 0.33

29:06

is what the student says; one third's the answer. The distributive property is one of these things that the models, back in the day at least, sometimes had trouble with. So anyway, we have the whole list of it. When we started this, of the, let's call it, two hundred test cases,

29:21

It was actually failing on about seventy percent of those test cases. Now, these are very niche. It's not like it fails on seventy percent of interactions, but it was seventy percent of the hard test cases. Now those numbers, I haven't looked at it in the last month or two, but it's like sub ten percent of the very hardest test cases, is my understanding. So we do some work like that. We also, you know, have machine labeling of interactions.
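The hard-case regression suite described here could be sketched roughly as follows. This is only an illustration, not Khan Academy's actual framework: the `HardCase` type, the two sample cases, the expected verdicts (including treating 0.33 as not equal to one third), and the stand-in `naive_grader` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HardCase:
    """One hand-curated case the tutor has historically gotten wrong (hypothetical)."""
    prompt: str             # what the tutor sees
    expected_correct: bool  # should the student's answer be accepted?


# Hypothetical cases modeled on the two examples mentioned in the conversation.
# Assumption: 0.33 is treated as NOT an acceptable answer for 1/3.
HARD_CASES = [
    HardCase("Answer key: 1/3. Student says: 0.33", expected_correct=False),
    HardCase("Answer key: 2(x + 3). Student says: 2x + 6", expected_correct=True),
]


def failure_rate(grade: Callable[[str], bool], cases: list[HardCase]) -> float:
    """Fraction of cases where the grader's verdict disagrees with the label."""
    failures = sum(1 for case in cases if grade(case.prompt) != case.expected_correct)
    return failures / len(cases)


def naive_grader(prompt: str) -> bool:
    """Stand-in for a model call: accepts every student answer."""
    return True


print(failure_rate(naive_grader, HARD_CASES))
```

Swapping `naive_grader` for a function that calls a real model would let the same suite be rerun whenever a new model ships, which is the kind of comparison the seventy percent and sub ten percent figures imply.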

29:48

So, you know, can we use an AI to look at the AI interactions and say, okay, it looks like there might have been an error here. The AI at some point said, "Oh, my bad, you got it right when I, you know, said wrong," or things like that. But then we also do human labeling because

30:04

the AI labeling isn't perfect. And so it's that human labeling work, and we did about 2,000 sample conversations, we did this about six months ago, that helped us get a much... you know, when I'm telling you these numbers, like two, three percent of errors, it's based on that human labeling. We're also doing some labeling on, I would say, not just errors but

30:23

the productivity of the conversation. Like, what percentage of conversations are healthy and the student is engaged with it, versus what percentage are the students just saying, I don't know, I don't know, I don't know.
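To make the sampling idea concrete, here is a small sketch of how an error rate estimated from a human-labeled sample carries some uncertainty. The count of 50 errors is a made-up number chosen to land in the "two, three percent" range for a sample of 2,000; the Wilson score interval is a standard statistical choice for proportions, not something the speakers say they use.

```python
import math


def error_rate_interval(errors: int, sample_size: int, z: float = 1.96):
    """Observed error rate plus a Wilson score interval (z=1.96 is about 95%)."""
    p = errors / sample_size
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2))
        / denom
    )
    return p, center - half_width, center + half_width


# Hypothetical labeling result: 50 conversations flagged as containing an error.
rate, low, high = error_rate_interval(errors=50, sample_size=2000)
print(f"observed {rate:.1%}, interval {low:.1%} to {high:.1%}")
```

The same function could be applied to the share of conversations labeled as unproductive, since that is also a proportion drawn from a sample.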

30:34

Have you been able to train a model to evaluate that, or is it all human labeled today, of like what's appropriate?

30:39

That's all human labeled. In theory we might be able to train a model to evaluate; in fact, we should be able to over time. But yeah, we're doing human labeling. It would also be interesting, 'cause I think in a traditional classroom you'd probably see similar ratios of the number of kids who are just, like, checked out and saying, I don't know, I don't know, they're not even engaged. But anyway, we're trying to label all of the above.

31:00

Do you have like an early you know, any early thoughts on uh you know, obviously I feel like people are studying the impact of these models in lots of different um, you know, uh contexts. And I feel like in the workplace there's been some studies that are like actually it's like uh for

⁠¶ Global Impact and Future Prospects

31:11

folks that maybe were in the bottom half of the workforce that are like super impactful and you know, I think it's everyone's still trying to figure this out. Like any early inklings on like, you know, at least to date where, you know, for the types of students or where these models have been like uh or these products have been most impactful? Or still too early to tell.

31:26

On the student side, I think for students who are curious and are already engaged, this is their dream come true. I mean, they can ask any question, et cetera. So, you know, but there's the efficacy. We're starting to run some, and we are seeing some correlation with engagement and potentially learning. But I think the models have to get much more

31:55

part of the learning experience than just being there to answer questions before we start seeing real movement there. Yeah.

32:01

Makes total sense. I mean, obviously you talked about evals, and I imagine in building all of this, you're one of the larger users of LLMs out there, and so I'm sure there's all sorts of things you had to build and things that were required on the infrastructure side.

32:13

Um, you know, what do you kind of wish from a tooling perspective was uh was available that you guys didn't have to build, uh that ideally you could just go get from uh from somewhere else to make these things, you know, more uh easier to use or kind of like easier to operate at the scale you guys operate?

32:26

Yeah, I mean especially the things I've been talking about. Uh you know, some memory uh architecture, um some um robust eval architecture, automated eval architecture, uh would be pretty pretty good. Um Yeah, those are those are the those are the the the two and then you know if there's some architecture for some people who've really like nailed it on on things like math and things. Yeah. Um so that it's it is evaluating well as opposed to just we know it, you know. Yeah.

33:00

You know, obviously it seems like you're making a big shift to these proactive um you know, making the models proactive. Um and and you know, to what extent does that end up being like customizable by an individual teacher versus like, hey, there's just general best practices for we're teaching algebra and so we know the kind of like pedagogy or prompts to do. Like how have you thought about um thought about that?

33:19

Yeah, the teacher in the loop is essential. Like, one of the issues is, you can say, hey, I'm an AI here, ask me any questions. You know, some kids will engage. You can say, hey, I'm the AI, I show up right when you come to the website, come ask me some questions. Or, you might have this question. And maybe that's a little bit better. But actually the best is, and this is what we're building for next year.

33:41

If the AI notices that a student, let's say it notices Jacob is having trouble with the distributive property, it tells Mr. Khan, Jacob's teacher: hey, Jacob's having trouble with the distributive property. Click here and you will assign a tutoring session with Khanmigo on the distributive property, and Jacob has to do it by tomorrow night. That's what will make it... it's his assignment. Jacob has to do it.

34:14

Um, so when you have teachers assigning AI tutoring interventions and then the teacher being able to hold the student accountable to that, that's when you're going to start finally seeing um kids engage.
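The teacher-in-the-loop flow described in this answer might look something like the sketch below. Everything here is hypothetical: the `TutoringAssignment` type, the `suggest_assignment` helper, and the thresholds for deciding that a student is struggling are illustrative assumptions, not a description of how Khanmigo works.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class TutoringAssignment:
    """A suggested AI tutoring session awaiting the teacher's one-click approval."""
    student: str
    skill: str
    due: date
    approved_by_teacher: bool = False


def suggest_assignment(
    student: str,
    skill: str,
    recent_results: list[bool],  # True = answered correctly
    today: date,
    min_attempts: int = 3,       # assumed: need some evidence before flagging
    threshold: float = 0.5,      # assumed: below 50% accuracy counts as struggling
) -> Optional[TutoringAssignment]:
    """Return a suggestion for the teacher, or None if there is no clear struggle."""
    if len(recent_results) < min_attempts:
        return None
    accuracy = sum(recent_results) / len(recent_results)
    if accuracy >= threshold:
        return None
    # Due "by tomorrow night", as in the example from the conversation.
    return TutoringAssignment(student, skill, due=today + timedelta(days=1))


suggestion = suggest_assignment(
    "Jacob", "distributive property", [False, False, True, False], date(2025, 1, 6)
)
print(suggestion)
```

The design point is that the system only proposes; the assignment becomes binding once the teacher approves it, which is what keeps the student accountable.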

34:26

It's so exciting. I mean I feel like in hearing that, uh obviously we're you know, in in in San Francisco there's all this, you know, uh techno optimism and and you kinda hear that and you're like, God, like it almost feels uh

34:36

you know, uh it almost feels like a disservice to the world that this stuff isn't more widely available. If we have these capabilities today, I mean especially, you know, even if you look at emerging markets where like maybe there isn't as as high quality of education today like Uh I'm sure there's a lot of techno optimists that are like, God, we should just be throwing these models at students like completely today and let them run like

34:54

What are those people missing and I guess, you know, how do you see the next few years playing out like globally uh because when you when you describe that use case, uh it seems it seems unbelievably compelling.

35:05

Look, there are some percentage of students in the world that can just run with things. There was this gentleman, Sugata Mitra, who did the school hole in a wall, school in a wall, something like that. And he put, like, tablets in rural India with poor kids and

35:22

Yeah, they claim that the kids just started going up to the tablets and started learning, et cetera, et cetera. I mean, there's definitely curiosity there. But I think that's limited, and yeah, I mean, there probably would be some benefit of just giving kids in villages in India access to ChatGPT or something. Some subset will probably start making really good use of it.

35:42

But once again, if you don't know what you don't know, I mean, you talked about the blank slate, you don't know how to even structure your own journey. You wouldn't even know to prompt it to say, hey, can you walk me through

35:52

the Indian national standards in algebra and, you know, you wouldn't even know to do that. So I do think you've got to structure it more. And that's what Khan Academy... one of our values is that we've always had that structured content. And there are kids, you know, I just met a young girl, a freshman at MIT, from Afghanistan.

36:09

She couldn't go to school. Khan Academy was her education. She used another platform we have called schoolhouse.world to prove what she knew. MIT accepted that and admitted her. So there are people out there who are running with the things that already exist.

36:23

But yeah, I think if you just start throwing stuff out, you know, you're not going to... but I do think in another year or two we will have stuff that you could... Ideally you do have access to a teacher who knows the algebra or the physics or whatever, but if you don't and you're in a container with, you know, Starlink access and some thirty, forty, fifty dollar tablets that are shared by five or six kids over the course of a day, I actually think you could get pretty far.

36:53

Yeah. Do you think we will see that?

36:55

Yeah, I mean, there's already people who are trying it out, but I don't see any reason why we wouldn't see it. One of the things that I hope to do in the next year for Khan Academy, let's call it two years, and really we need some funding on the transcripting side to automate the transcripting, but I hope to give high school credits and high school diplomas.

37:16

And I think once you offer high school credits, high school diplomas, and eventually college credits and college diplomas... I mean, I don't know when we'll do college diplomas, but college credits for sure. Then all of a sudden, if you're this kid in India, it becomes very compelling for you to have access to one of these devices and, you know, at least getting an internationally recognized high school diploma goes a long way.

37:40

I guess I I'd love your commentary on just like the broader AI and education market. Like, you know, what do you what have you kind of seen happening? Um, you know, what are some other spaces or or areas you pay attention to in that? Obviously it seems like you're really going for the heart of it, but curious like what else you've seen or or what else you find interesting going on.

⁠¶ Challenges and Innovations in AI Education

37:55

I you know, there's I think we're just we're there's a lot of noise right now. There's a lot of startups.

38:01

sounds like AI

38:02

What we see is a lot of people who've done a layer over ChatGPT and some prompting and, you know, some of them maybe get a little bit of traction, but it is actually a hard environment. I mean, you're an investor here. In some ways it's a dream that you can make apps so much faster than you could before. Apps of substance. At the same time, the switching costs and the barriers to entry are so low.

38:32

I do think, not to pick on you guys, some of the incentives of the investor community don't necessarily allow you to take a slightly longer point of view. You know, that's what I tell our team. Like, if the market fully solved the problems, Khan Academy doesn't need to exist. If we're a nonprofit,

38:54

I would say in this moment our value is we actually do have permission to take a slightly longer view. We still, you know, put a lot of pressure on ourselves to be very relevant in the short run and compete with anyone, but we have funders who will give us, you know, a five year grant to think through assessment,

39:12

to think through writing, knowing that it's going to evolve and change. And and honestly some startups don't have that luxury. They just have to, you know, in the next nine months get some product market fit, otherwise they're going to disappear. Um, and I think our other value is the trust, uh that, okay.

39:28

You know, people are more sensitive around AI. Okay, are they really not about, you know, somehow undermining teachers? Are they really about the pedagogy, or do they really have efficacy studies here? Is this something that I can feel good about? And, you know, over the years hopefully we've been building a lot of that trust. So that helps in this AI world.

39:47

Are there any corners of like the education market that that maybe you guys aren't focused on but you're like, Oh, I wish there was startups that were trying to build, you know, uh this area or or or help out with these kind of products?

39:56

I'm sure there are startups that are doing it. Um, I'm not aware of them and none of them have gone mainstream as far as I can tell. But um Oh I mean there's just so much. The the whole like interviewing piece is so uh resource intensive and broken and interviewing is assessment. And it's just it just doesn't feel like assessment because it's so non-standardized and it's so expensive.

40:22

And, you know, I used to do, even before AI, like, if I get to a final round with Google and Google tells me, hey Sal, you're a great guy. It was either you or Jacob, and we really had to flip a coin. So we gave the job to Jacob. You know, good luck with your life. And

40:41

And then I go to Microsoft and have to start over again. And Microsoft has to start over again and spend all that time. They shouldn't have to. Like, the fact that I got all the way to that end, Google should give me some type of certificate saying, like, Sal is really good, you know, and then all Microsoft needs to do is do a final round to make sure that I'm not, you know, that I fit in and

41:01

and that I'm aligned with the job and things like that. So there there's always been huge, huge, huge inefficiencies, you know, just even who gets access to an interview. It's a small subset. Yeah, we just have we have a product manager job out. We get hundreds and hundreds of applicants for the job. We're probably going to be able to interview phone screen maybe twenty, twenty-five people. Um

41:24

I'm sure there's some false negative in there. I'm sure we're gonna miss in that 180, 280 people that we're not interviewing, not even phone screening. I'm sure there was some talent there. But may maybe the best applicant is in that 280 that we don't even call back. So um anyway, I think there's there's interesting things there. I think there's interesting things in just corporate training, development.

41:46

Um, you know, all those those things that we all have to do, these uh, you know, what's appropriate at work type of cybersecurity training or sexual harassment training and all that. I gotta believe a a simulation with an AI would be way more interesting. Like the following has just played out. What do you do next? And

42:04

Totally. Well I mean you you you mentioned obviously this uh you know uh the application of these in the in the corporate world and I I guess You know, maybe zooming out, it I feel like one broader question I'm sure you think a lot about is like the skills that will actually matter just how fast the world's changing, like the skills that will actually matter for students that we're training for like the future workforce as as these models keep getting better, like

42:23

How do you think about that and like the jobs we're we're training people for and to what extent, if at all, like what we teach uh has to has to adjust or or change to adapt to that?

42:31

Yeah, I believe that a smart, cogent, strong critical thinker... that's always a really great baseline skill to have. So I think kids should continue to learn to write and read and do their math well and, you know, have good general knowledge of social studies, history, and status civics. But I think the thing that's really going to be... I think it

42:59

I think

43:00

Economists talk about entrepreneurship as a factor of production, right? This ability to take resources that already exist, but put them in new permutations to create value that didn't exist in the world before. And most people always assume entrepreneurship is kinda like starting Khan Academy or what you'd invest in. And it is that. But I actually think that's going to be more and more of table stakes in almost any

43:24

career, especially over this next 20 years when there's just, like, super rapid change. The people who otherwise have solid skills, but they're just always saying, hey, wow, I just heard about this one thing. Let me try to use that and put it together with this thing. And wow, if I do this and then I take that output and I put it here and I can make this, I might be able to make a pretty good

43:47

And it might not be the final output, but it might get me eighty percent there. And then I have to use my skills to tweak it to be those people I think are going to be um that's those are the skills. Those are the skills and um

44:01

That's those are who I'm looking to highlight in our in our organization. And uh, you know, organizations that are not doing a lot of that are also going to suffer. If they don't have a lot of that entrepreneurship inside, they're not going to be able to innovate. Their cost structures are going to be way higher than everyone else.

44:15

No, I love that. Um well we always like to end our interviews with a quick fire round where we get your your take on a on a set of questions. Um, you know, maybe to start, uh what's one thing you've changed your mind on with regards to AI in the last year?

⁠¶ Quickfire

44:27

It's obvious in hindsight, but AI on its own. And I I don't think I would have said AI on its own is going to solve all the problems. I wouldn't have said that even two years ago. But I've definitely shifted even more towards a lot of what we talked about, which is like it's more about how does AI really empower the teacher to hold the students accountable and engage students in other productive things.

44:51

What's your favorite way that you use AI today within Khan Academy and then anything on the wish list where you hope that you'll be using it, you know, a year or two from now?

44:58

When I'm preparing to make a video, you know, one of my, I guess you could call it, if you want to say, superpowers is I ask all of the dumb questions in my head that sometimes people are afraid to ask. And I think sometimes teachers glaze over them because they were afraid to ask them too. But I'm like, wait, how does this make sense? Even if I'm doing, like, a fourth grade concept, I'm like, wait,

45:17

Wait, that that doesn't make sense. Okay. And sometimes I can figure it out by myself, but a lot of times I I you know, in the past I would I would do web searches, I'd call people up. I have found that for someone who's asking the right questions, AI can dramatically accelerate um that process. I'm always able to make fun images now for for videos that can visualize uh what I'm trying to show. So that's been valuable.

45:41

Every now and then I have to give... I tend to give most of my speeches off the cuff, but I've had to give a few formal speeches, things like commencement addresses and things. And that's where just doing a verbal dump and then having the AI transcribe it and then at least doing the first pass at the speech and then tweaking it saves hours. I wanna try doing some

46:04

You know, I've I've been talking to some of these people who have these platforms. They say that you can now like prompt the AI and it'll make the whole app and they'll host it for you and everything. I wanna try that out because I've

46:13

Vibe coding on the docket.

46:15

Vibe coding, there's Thunkable, which, I know the founder pretty well. And he was telling me the other day that, no, you can make the whole app. And I even described an app that I wanted to make and he's like, yep, that would probably work. And I'm like, wow, okay. So I need to try that out.

46:28

I keep telling my son, my sixteen year old who's really into programming and making games, like, you should be doing this vibe coding. And it's so funny 'cause he's like an old man. He's like, no, that's not... you know. And I'm like, no, but that's the future of coding, you gotta do a little bit of both. He's like, no, I need to learn it properly myself and

46:44

I guess, you know, what's been the biggest surprise in building these features? I guess maybe something that you thought would work really well that didn't, or something that you weren't expecting much from and has actually been hugely successful in the adoption of Khanmigo? I mean it

46:56

It's not surprising in that almost everything turns out this way when you try to do something at scale and enterprise, but yeah, it's a lot more work. Like, when I first saw these models, I'm like, wow, these are magical. And look, I still do worry that, you know, will these models leapfrog the apps that are using them in some way? Or will, you know, the general apps like the ChatGPTs leapfrog the specialized apps.

47:25

It's still a concern, but yeah, I mean, the amount of work to make it really work well for a certain use case is a lot. Things that have been better than expected: you know, I was afraid two years ago whether our team was ready for this type of a pivot. And I was worried whether the education community was ready for this type of a pivot. I actually thought we might get a lot of flack for leaning so hard into AI. But

47:52

You know, it is pretty interesting now. Our team does view itself as an AI-first organization. And it's easy for me to say, but when I really think about ourselves two, three years ago, that was not the obvious thing, and there was some pushback and there were people who didn't like that direction. And I would also say the education community has moved faster than expected.

48:13

What were the weeks like? I mean, obviously you got the GPT-4 demo, you did your AP Bio tests, and then you came back and you're trying to reorient. What was that first month like, post that demo?

48:22

Well, almost every day we were talking to the OpenAI lawyers. We're like, we have five more people we want to get under NDA, or ten more people. And we had an on-site, I remember, that first... and we had a little office, and every time we got people we said, come with me, and I wanted to show people 'cause it was such a... like, they would

48:39

They thought they were getting punked. This was before ChatGPT or anything. And I'm like, I have something to show you, and they would walk in, and I just enjoyed doing the demo, and they're like, what's going on? What is this? What are you showing me? And I was like, this is the future. But, you know, a lot of those people

48:59

I think initially about half the company was like, this is everything. We have to stop what we're doing. And then half of them were like, hold on, it makes errors, hallucinates, bias, it's gonna freak people out, cheating. And we said, look, both sides are right, but we've gotta take all those risks and turn them into features.

49:20

Oh, I love that. Well look, this has been a fascinating conversation. I want to make sure I leave the last word to you. Uh anywhere you want to point our listeners uh where they can go to learn more about Khan Academy, the AI work you're doing, uh anything else, the the floor is yours.

49:31

Yeah, you know, people who want to try it out, you go to Khan Academy, Khanmigo. You could try the writing coach if you're, you know, a homeschooler or teacher. Also, people should know about schoolhouse.world, which is a sister nonprofit that I helped start,

49:44

which is around free tutoring. We do it through volunteers. So if people want to volunteer and give tutoring or get tutoring, we're about to launch something called the Dialogues Initiatives where you're going to have conversations about tough topics, but uh and then rate each other on how well the other side listened. So um yeah, t take take a look. And remember we're nonprofit, so donate.

50:04

Yeah. Awesome. Well, thanks so much, Sal. This was a fascinating conversation.

50:07

Great, thank you.

50:08

Hey guys, this is Jacob. Just one more thing before you take off. If you enjoyed that conversation, please consider leaving a five-star rating on the show. Doing so helps the podcast reach more listeners and helps us bring on the best guests.

50:19

This has been an episode of Unsupervised Learning, an AI podcast by Redpoint Ventures, where we probe the sharpest minds in AI about what's real today, what's going to be real in the future, and what it means for businesses in the world. With the fast-moving pace of AI, we aim to help you deconstruct and understand the most important breakthroughs and see a clearer picture of reality. Thank you for listening and see you next episode.

50:39

🎵 Music

✨ This transcript was generated by Metacast using AI and may contain inaccuracies.
