
Teaching Robots How to Do Everything

Jun 05, 2025 · 37 min · Season 1 · Ep. 138

Episode description

AI is better than humans at a lot of things, but physical tasks – even seemingly simple ones like folding a shirt – routinely stump AI-powered robots. Chelsea Finn is a professor at Stanford and the co-founder of Physical Intelligence. Chelsea's problem is this: Can you build an AI model that can teach any robot to do any task, anywhere?

Get early, ad-free access to episodes of What's Your Problem? by subscribing to Pushkin+ on Apple Podcasts or Pushkin.fm. Pushkin+ subscribers can access ad-free episodes, full audiobooks, exclusive binges, and bonus content for all Pushkin shows.

Subscribe on Apple: apple.co/pushkin
Subscribe on Pushkin: pushkin.com/plus


Transcript

Speaker 1

Pushkin. In a metaphorical sense, AI is everywhere. It can write essays, it can do your taxes, it can design drugs, it can make movies. But in a literal sense, AI is not everywhere. You know, a large language model can tell you whatever twenty seven ways to fold your shirts and put them in the drawer, but there's no robot that you can buy that can actually fold your shirts

and put them in the drawer. At some point, though maybe at some point in the not that distant future, there will be a robot that can use AI to learn how to fold your shirts and put them in the drawer, or you know, cook lasagna, pack boxes, plug in cables. In other words, there will be a robot that can use AI to learn how to do basically anything. I'm Jacob Goldstein and this is What's Your Problem, the show where I talk to people who are trying to

make technological progress. My guest today is Chelsea Finn. She's a professor at Stanford and the co-founder of a company called Physical Intelligence, aka PI. Chelsea's problem is this, can you build an AI model that will bring AI to robots? Or, as she puts it:

Speaker 2

We're trying to develop a model that can control any robot to do any task anywhere.

Speaker 1

Physical Intelligence was founded just last year, but the company has already raised over four hundred million dollars. Investors include Jeff Bezos and OpenAI. The company has raised so much money in part because what they're trying to do is so hard. Motor skills, the ability to move, to find ways to fold a shirt or to plug in a cable, feel simple to us, easy, basic. But Chelsea told me basic motor skills are in fact wildly complex.

Speaker 2

All of the motor control that we do with our body, with our hands, with our legs, our feet, a lot of it we don't think about when we do it. It actually is incredibly complicated what we do. This is actually a really, really hard problem to develop in AI systems and robots, despite it seeming so simple. And the reasons for that are, first, that it actually is inherently very complex, and second, that we don't have tons and tons of data of doing this, in part because it's so basic to humans as well.

Speaker 1

Right, let's talk about the data side, because that seems like really the story, right, the big challenge, and it's particularly interesting in the context of large language models and computer vision which really seem to have emerged in a weird way as a consequence of the Internet. Right, just because we happen to have this crazy amount of data of words and pictures on the Internet, we were able to train language models and computer vision models. But we

don't have that for robots, right. There is no data set of training data for robots, which is like the big challenge for you and for robotics in general, it seems.

Speaker 2

Yeah, so we don't have an open internet of how to control motors to do even really basic things. Maybe the closest thing we have is videos of people doing things, and perhaps that could be useful. But at the same time, if I watch videos of, like, Roger Federer playing tennis, I can't just become an amazing tennis player as a result of that. And likewise, just with videos of people doing things, it's very hard to actually extract the motor control behind that.

And so that lack of data, that scarcity of data, makes it in some ways a very different problem than in language and computer vision. And I think that we should still learn a lot of things from language and computer vision and collect large data sets like that. It opens up new challenges and new possibilities on that front, and I think that in the long run we should be able to get large amounts of data, just like how in autonomous driving we have lots of data of cars driving around very effectively. Robots too could be in the world collecting data, learning how to pick up mustard and put it on a hot dog bun, or learning how to open a cabinet to put some objects away. We can get that sort of data, but it's not given to us for free.

Speaker 1

You still have this core problem, which is there is no giant trove of physical reality data that you can train your model on. Right, That's the great big challenge, it seems, what do you do about that? How do you start to approach that?

Speaker 2

Yeah, so we're starting off by collecting data through teleoperation, where people are controlling the robot to do tasks, and then you don't just get video data. You get the videos alongside the actions or the motor commands needed to actually accomplish those tasks. We've collected data in our own office. We've also collected data in homes across San Francisco, and we also have a very modest warehouse. In some ways our current operation is actually rather small, given that we're a little over a year old at this point.

Speaker 1

Like what's actually happening? Like if I went into your warehouse and somebody was doing teleoperation, what would I see? What would it look like?

Speaker 2

Yeah, so it's a little bit like controlling a puppet. The person who's operating the robot is holding, in some ways, a set of robot arms, but they're very lightweight robot arms, and we use those to measure the positions of joints.

Speaker 1

It's almost like an elaborate controller for a video game or something. It's like that, it's not actually a robot arm, right? It's a thing you control to sort of play the robot, to make the robot move.

Speaker 2

Yeah, exactly, exactly, and then we record that and directly translate those controls over to the robot. We have some robots that are just robot arms, where you're only controlling the robot arm. It's mounted to a table

or something like that. But we also have what we call mobile manipulators that have wheels and robot arms, and you can control both how the robot drives around as well as how the arms move and we're doing tasks like wiping down counters, folding laundry, putting dishes into dishwashers, plugging cables into data center racks, assembling cardboard boxes, lots and lots of different tasks that might be useful for robots to do, and recording all the data. So we

have cameras on the robots. There are sensors on the joints on the motors of the robots as well, and we record that in like a synchronized way across time.
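To make that concrete, here is a minimal sketch of what one synchronized teleoperation recording loop could look like. It is purely illustrative: the hardware interfaces (leader, robot, cameras), the field names, and the fifty-hertz rate are assumptions made for the sketch, not Physical Intelligence's actual stack.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TeleopStep:
    timestamp: float       # shared clock stamp for this step
    images: dict           # camera name -> image array, e.g. {"wrist_left": ..., "base": ...}
    joint_positions: list  # measured joint angles on the robot (radians)
    joint_torques: list    # motor load readings from the joint sensors
    action: list = field(default_factory=list)  # command mirrored from the lightweight arms

def record_episode(leader, robot, cameras, hz=50, max_steps=3000):
    """Mirror the lightweight leader arms onto the robot while logging everything in sync."""
    episode, dt = [], 1.0 / hz
    for _ in range(max_steps):
        t = time.time()
        target = leader.read_joint_positions()  # the human moves the "puppet" arms
        robot.command_joint_positions(target)   # the robot mirrors them in real time
        episode.append(TeleopStep(
            timestamp=t,
            images={name: cam.read() for name, cam in cameras.items()},
            joint_positions=robot.read_joint_positions(),
            joint_torques=robot.read_joint_torques(),
            action=list(target),
        ))
        time.sleep(max(0.0, dt - (time.time() - t)))  # hold the loop rate
    return episode
```

The point of the synchronization is that every camera frame is stored alongside the joint readings and the operator's command for the same instant, so a model can later learn the mapping from what the robot saw to what the human did.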

Speaker 1

So when you do it, it's kind of like a real world video game, like, you're moving your arms in these things, and in basically real time the robot arm is moving and picking up the thing you wanted to pick up. And like, what's it like? Is there a curve where at the beginning it's really bad? Sort of talk me through an instance.

Speaker 2

It depends on the person. So some people can pick it up really, really quickly. Some people are a bit slower to pick it up. I pride myself on being a pretty good operator, and so I have done tasks as complex as peeling a hard boiled egg with the robot, which is hard.

Speaker 1

How are you at peeling a hard boiled egg with your hands?

Speaker 2

It's pretty hard with my own hands too, yeah, and with the robot it's even harder.

Speaker 1

Tell me about the robot peeling a hard boiled egg, because that sounds like a hard one.

Speaker 2

Yeah. So basically all the robots that we're using have kind of pincher grippers. They're called parallel jaw grippers, where there's just one degree of freedom, like open-close, two pincers.

Speaker 1

It's basically two pincers, like two...

Speaker 2

Two pincers, two arms, yeah, exactly. And I've used that exact setup. There are six different joints on the arm, so it can move with basically a full range of motion in three-D space and three-D rotation, and you can use that to peel a hard boiled egg. You don't have any tactile feedback, so you can't actually feel the egg, and that's actually one of the things that makes it more difficult. But you can use visual feedback to compensate for that. And so just by looking at the egg myself, I'm able to figure out if I'm in contact with something, and you just...

Speaker 1

Use one prong of the claw like what I could say, you squeeze it a little to crack it, and then use like one prong of the claw to get the shell off.

Speaker 2

Yeah, exactly. You want to crack it initially and then hold it with one gripper, and then use basically one of the two fingers in the gripper to get pieces of shell off. When we did this, we hard boiled only two eggs. This was actually at Stanford. The first egg a graduate student ended up breaking, and so I did the second egg, and I was able to successfully not break it and fully peel it. It took some patience, certainly, and I wasn't able to do it as quickly as with my own hands. But I guess it goes to show the extent to which we're able to control robots to do pretty complicated things.

Speaker 1

Yeah, and so obviously, I mean, that is a stunt or a game or something fun to do with the robot. But presumably in that instance, as in the other instances of folding clothes and vacuuming, like, there is learning, right? The idea is that you do it some number of times and then the robot can do it, and then presumably there's also generalization. But just to start with learning, like, you know, reductively, how many times do you have to do it for the robot to learn it?

Speaker 2

Yeah, so it really depends on the extent to which you want the robot to handle different conditions. So in some of our research, we've been able to show the robot how to do something like thirty times or fifty times, and maybe that sounds like a lot, but you can do that in typically less than an hour if it's a simple task. If you only demonstrate it in a narrow set of circumstances, like a single environment, a single particular object, the robot can learn it just from less than an hour of data.

Speaker 1

What is an example of a thing that the robot learned in less than an hour of data?

Speaker 2

Oh yeah, we've put a shoe on a foot. We've torn off a piece of tape and put it on a box. We've also hung up a shirt on a hanger.

Speaker 1

So that's not that much. I mean, especially because you say the robot, but what you really mean is the model. So every robot, right, presumably, or every robot that's built more or less like that one, right? Like, that's one of the key things. You're not teaching one robot, you're teaching every robot ever, because it's software, fundamentally. It's an AI model. It's not hardware.

Speaker 2

Yeah, yes, with the caveat that, if you want to be this data efficient, it works best if it's, like, the same color of table, the same kind of rough initial conditions of where the objects are starting, right, and the same shirt, for example. So this is just with a single shirt and not any shirt.

Speaker 1

So there's like concentric circles of generalizability, right? Like, exact same shirt, exact same spot, exact same table, versus like fold a shirt, versus fold clothes. And so is that just infinitely harder? Like, how does that work? That's your big challenge at some level, right?

Speaker 2

Yeah. So generalization is one of the big challenges, not the only one, but it's one of the big challenges. And in some ways, I mean, the first unlock there is just to make sure that you're collecting data not just for one shirt, but collecting it for lots of shirts, or collecting it for lots of clothing items, and ideally also collecting data with lots of tables with different textures, and not just visual appearances, but also, like, if you're folding on a surface that has very low friction, like it's very smooth, versus a surface that's maybe on top of carpet or something, the shirt is going to behave differently when you're trying to move it across the table. So having variability in the scenarios the robot is experiencing in the data set is important, and we've seen evidence that if you set things up correctly and collect data under lots of scenarios, you can actually generalize to completely new scenarios. In the pi zero point five release, for example, we found that if we collected data in roughly one hundred different rooms, then the robot is able to do some tasks in rooms that it's never been in before.

Speaker 1

So you mentioned pi zero point five. That's your latest model that you've released, right? Tell me about that. Like, what does that model allow robots to do? What robots and what settings and what tasks?

Speaker 2

Yeah, yeah, definitely. So we were focusing on generalization. With the previous model, we were focusing on capability, and we did a really complicated task of laundry folding. From there, we wanted to answer, like, okay, that model worked in one environment. It's fairly brittle. If you put it in a new environment, it wouldn't work. And we wanted to see, if we put robots in new environments with new objects, new lighting conditions, new furniture, can the robot be successful?

And to do that, we collected data on these mobile manipulators, which feels like a terrible name, but robots with two arms and wheels that can drive around, kind of like a humanoid, but we're using wheels instead of legs, a bit more practical in that regard. And we trained the robot to do things like tidying a bed, or wiping spills off of a surface, or putting dishes into a sink, or putting away items into drawers, taking dirty clothing off the floor and putting it into a laundry basket, things like that. And then we tested whether or not, after collecting data like that in lots of environments, aggregated with other data, including data from the internet, the robot can then do those things in a home that it has never been in before. And in some ways that sounds kind of basic. People have no problem with this; if you can do something in one home, you can probably do the same thing in another home. It doesn't seem like a complicated thing for humans. But for robots that are trained on data, if they're only trained in one place, their whole universe is that one place; they haven't ever seen any other place. This is actually kind of a big challenge for existing methods. And yeah, it was a step forward. It definitely isn't perfect by any means, and that comes to another challenge, which is reliability. But we were able to see the robot do things in homes it's never been in before, where we set it up, ask it to do things, and it does some things that are useful.
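A small sketch of the evaluation idea described here: split the data by environment rather than by episode, so the test homes are entirely unseen during training, then score the robot only in those held-out homes. The Episode fields (home_id, success) are hypothetical stand-ins.

```python
import random

def split_by_home(episodes, test_fraction=0.2, seed=0):
    """Hold out whole homes so generalization is measured on unseen places."""
    homes = sorted({ep.home_id for ep in episodes})  # ep.home_id is a hypothetical field
    random.Random(seed).shuffle(homes)
    n_test = max(1, int(len(homes) * test_fraction))
    test_homes = set(homes[:n_test])
    train = [ep for ep in episodes if ep.home_id not in test_homes]
    test = [ep for ep in episodes if ep.home_id in test_homes]
    return train, test

def success_rate(rollouts):
    """Fraction of evaluation rollouts that completed the commanded task."""
    return sum(r.success for r in rollouts) / len(rollouts)
```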

Speaker 1

So, like, in the classical setting, where a robot is trained in one room, it doesn't even know that the room is a room. That's just the whole world to the robot, right? And if you put it in another room, it's in a completely unfamiliar world.

Speaker 2

Exactly. And so, for example, with what we were talking about, like hanging up a shirt, its whole world was that one smooth black tabletop, that one blue shirt, that one coat hanger. And it doesn't know about this entire universe of other shirts and other...

Speaker 1

It doesn't know that there is a category called shirt. It only knows.

Speaker 2

Yeah, it doesn't even know what shirts are.

Speaker 1

Yeah, it doesn't even know what shirts are. For pi zero point five, like, what did you ask the robot to do? And how well did it work?

Speaker 2

Yeah. So we trained the model. We took actually a pre-trained language model, with also a vision component, and we fine tuned it on a lot of data, including data from different homes across San Francisco, but actually a lot of other data too. So actually only two percent of the data was from these mobile robots with arms. We can store how the motors were all moving in all of our previous data and then train the model to mimic that data that we've stored.

Speaker 1

It's like predicting the next word, but instead of predicting the next word, it's like predicting the next movement or something like that.

Speaker 2

Yes, exactly. We've kind of trained it to predict next actions or next motor commands instead of next words. We do an additional training process to have it focus on and be good at the mobile robot data in homes. Then we set up the robot in a new home and we give it language commands. We can give it low level language commands, or we can actually also give it higher level commands. So the highest level command might be clean the bedroom. And one of the things that we've also been thinking about more recently is, can you give it a more detailed description of how you want it to clean the bedroom? But we're not quite there yet. So we could say clean the bedroom. We'd also tell it put the dirty clothes in the laundry basket, so that would be kind of a subtask. Or we can tell it commands like pick up the shirt,

put the shirt in the laundry basket. Then after we tell it that command, then it will go off and follow that command and actually in most cases realize that command successfully in the real world.
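The next-word analogy maps onto a standard behavior-cloning setup: condition on camera images plus a language command, and regress the motor commands the teleoperator actually produced next. Here is a minimal sketch of that idea in PyTorch. The backbone call, the fourteen-dimensional action space, the chunk length, and the plain mean-squared-error loss are all illustrative stand-ins, not Physical Intelligence's actual architecture or training recipe.

```python
import torch
import torch.nn as nn

class ActionHead(nn.Module):
    """Maps a fused vision-language embedding to a short chunk of future actions."""
    def __init__(self, embed_dim=768, action_dim=14, chunk_len=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 1024), nn.GELU(),
            nn.Linear(1024, action_dim * chunk_len),
        )
        self.action_dim, self.chunk_len = action_dim, chunk_len

    def forward(self, embedding):
        # embedding: (batch, embed_dim) fused vision-language features
        return self.net(embedding).view(-1, self.chunk_len, self.action_dim)

def behavior_cloning_loss(backbone, head, batch):
    """Predict the teleoperator's next motor commands instead of the next word."""
    # backbone is a stand-in for a pretrained vision-language model that fuses
    # camera images with a language command like "pick up the shirt".
    embedding = backbone(batch["images"], batch["command"])
    predicted_actions = head(embedding)  # (batch, chunk_len, action_dim)
    return nn.functional.mse_loss(predicted_actions, batch["actions"])
```

In the interview's terms, the "only two percent mobile robot data" point would show up as a weighting over data sources when sampling batches, with the additional training pass focusing the model on the mobile-robots-in-homes slice.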

Speaker 1

How did it do?

Speaker 2

So it depends on the task. The average success rate was around eighty percent, so definitely room for improvement, and in many scenarios it was able to be quite successful.

We also saw some failure modes where, for example, if you're trying to put dishes into a sink, sometimes one of the dishes was a cutting board, and picking up a cutting board is actually pretty tricky for the robot, because you either need to slide it to the edge of the counter and then grasp it, or somehow get a finger underneath the cutting board. And so sometimes it was able to do that successfully.

Sometimes it struggled and got stuck. The exciting thing, though, was that we were able to kind of drop it in a place it had never been before, and it was doing things that are quite reasonable.

Speaker 1

So what are you doing now? Like, what's the next thing you're trying to get to?

Speaker 2

Yeah, absolutely. So the next thing we're focusing on is reliability and speed. So I mentioned around eighty percent for these tasks. How do we get that to ninety nine percent? And I think that if we can get the reliability up, that's kind of, in my mind, the main missing ingredient before we can really have these be useful in real world scenarios.

Speaker 1

So getting to ninety nine percent is interesting. I mean, I think of self driving cars, right, where it seemed, some time ago, I don't know, ten years ago, fifteen years ago, like they were almost there, and I know they're more almost there now. I know in San Francisco there really are self driving cars, but they're still very much at the margin of cars in the world, right? And it does seem like almost there means different things in different settings. But I don't know, is it super hard to get from eighty percent to ninety nine percent? Does the self driving car example teach us anything for your work?

Speaker 2

The self driving car analogy is pretty good. I do think that, fortunately, there are scenarios where we may not need it to be quite as reliable as cars. With cars there is a much, much higher safety risk. It's much easier to hurt people. And with robots there are safety risks, because you are in the physical world, but it's easier to put software precautions in place, and even hardware precautions in place, to prevent that as well. So that makes it a little bit easier.

Speaker 1

I mean, ninety nine percent probably isn't good enough for cars, right? They probably need more nines than that, whereas it may well be good enough for a house cleaning robot.

Speaker 2

Yeah, in certain circumstances. And yeah, we're also thinking about scenarios where maybe even less than that is fine, if we view humans and robots working together; it's more about helping the person complete the task faster or complete the task more effectively. So I think there might be scenarios like that, but still we need the performance and reliability to be higher, and the robots to be faster, in order to accomplish that.

Speaker 1

We'll be back in just a minute. What do you imagine as the initial real world use cases?

Speaker 2

I don't know. There are a lot of examples of robotics companies that have attempted to kind of start with an application and home in on that, and I think the lesson from watching those companies is that you end up spending a lot of time on the problems of that specific application and less on developing the sort of generalist systems that we think in the long run

will be more effective. And so we're very focused on understanding what are the core bottlenecks and the core missing pieces for developing these generalist models, and we think that if we had picked an application now, we would kind of lose sight of that bigger problem because we need to solve things that are specific to that application. So we're very focused on what we think are like the core technological challenges. We have certain tasks that we're working on.

Some of them have been home cleaning tasks. We also have some more kind of light industrial tasks as well, just to instantiate things and actually be iterating on robots. And applications could range from things in homes to things in workplaces to industrial settings. There are lots and lots of use cases for intelligent robots and intelligent kind of physical machines.

Speaker 1

What are some of the industrial tasks you've been working on.

Speaker 2

One example that I mentioned before is inserting cables. There are lots of use cases in data centers, for example, where that's a challenging task. Another example is constructing cardboard boxes and filling them with items. We've also done some packaging tasks highly relevant to lots of different kinds of shipping operations.

And then even folding clothes. It seems like a very home task, but it turns out that there are companies that need to fold like very large lots of clothing, and so that's also something that in the long term could be used in larger scale settings.

Speaker 1

So I've read that you have open sourced your model weights and given designs of robots to hardware companies, and I'm interested in that and that set of decisions, right, that set of sort of strategic decisions. Tell me about that sort of giving away IP basically.

Speaker 2

Right, yeah, yeah, definitely. So this is a really hard problem, especially this longer term problem of developing a general system. We think that the field is very young, and there's like a couple of reasons. One is that we think that the field needs to mature, and we think that having more people being kind of competent with using robots and using this kind of technology will be beneficial in the long term for the company, and by open sourcing things,

we make it easier for people to do that. And then the second thing is, the models that we develop right now, they're very early, and the models that we'll be developing one to three years from now are going to be far, far more capable than the ones that we have now. And so it's kind of equivalent to, like, OpenAI open sourcing GPT-2 or GPT-3. They actually didn't open source GPT-3, but I think that they would still be in an excellent spot today if they had.

Speaker 1

Like, what could go wrong that would either prevent you as a company from succeeding or even hold back the field in general?

Speaker 2

I don't think we entirely know the scale of data that we need for getting really capable models. And there's a little bit of a chicken and egg problem, where it's a lot easier to collect data once you have a really good model, but getting a really good model takes large amounts of data.

Speaker 1

Right. Or if there were thousands of robots out in the world running your model, there would just be an incredible amount of data coming in to you every day, right?

Speaker 2

Yeah, yeah, exactly. So that's one thing, though I'm actually maybe a little bit less concerned about that myself. And then I think the other thing is just that there are technological challenges to getting these things to work really well. I think we've had incredible progress over the last year and two months, the last fourteen months or so since we started, probably more progress than I was expecting, honestly, compared to when we started the company. I think it's wild that we were able to get a robot to unload and fold laundry, like a ten minute long task.

Speaker 1

And folding laundry is like a famously hard robot problem, right? Like, it's the one that people in robotics talk about when they talk about how things people think are easy are actually hard for robots, right?

Speaker 2

Yeah, absolutely, absolutely. I mean, you have to deal with all sorts of variability in how clothes can be crumpled on each other. And there are even really small, minor things you need to do in order to actually get it to be flat on the table and folded nicely and even stacked. And as the task gets longer, there are more opportunities to make mistakes, more opportunities to get stuck. And so if you're doing a task that takes ten minutes, in those ten minutes there are many, many times where the robot can make a mistake that it can't recover from, or just get stuck, or something like that. And so being able to do such a task starts to point at the resilience that these models can have by recovering from those mistakes. When we were first trying to fold laundry, one of the common failure modes was that it would fold the laundry very well, by my standards at the time, I would be very, very happy with the robot, and then it would push the entire stack of laundry onto the ground.

Speaker 1

Sort of like teaching a toddler to fold clothes.

Speaker 2

Yeah, yeah, exactly.

Speaker 1

Was there a particular moment when you saw a robot using your model fold clothes for ten minutes and it worked?

Speaker 2

Yeah. First off, we started with just folding a shirt starting flat on the table. We got that to work pretty quickly; that turns out to be pretty easy, and I wasn't too surprised by that. And then we moved from that to starting it in just a random ball, like some sort of crumpled position on the table, and then you have to flatten it and then fold it, and that makes the problem dramatically harder because of all the variability, having to figure out how to flatten it. We were kind of stuck on that problem for at least a couple of months, where with everything we were trying, the success rate of the robot was zero percent. It wasn't able to really make progress on it. And we started to see signs of life, I think, in August or September of last year, when we tried a new recipe where we continued to train the model on a curated part of the data that was following a consistent strategy, and that sort of high quality post training is what really seemed to make the model work better. And then the moment that I was most excited about was the first time that I saw the model flatten and fold and stack five items in a row.
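A sketch of the curation idea, under the hypothetical assumption that each recorded episode carries a success flag and a strategy label; real curation could equally rely on human review or other heuristics not described here.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    task: str        # e.g. "fold_laundry"
    success: bool    # did the demonstration complete the task?
    strategy: str    # label for how the operator approached it

def curate(episodes, preferred_strategy="flatten_then_fold"):
    """Keep only successful demos that follow one consistent strategy."""
    return [ep for ep in episodes
            if ep.success and ep.strategy == preferred_strategy]

# Post-training then means continuing the same behavior-cloning objective,
# but only on curate(episodes), a smaller and more consistent dataset.
```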

Speaker 1

Yeah.

Speaker 2

I just remember going home that night and being so excited. It seemed like we had just figured out this big missing puzzle piece.

Speaker 1

So I was asking you why might it not work, or what might slow the field down, and then we talked about the happy shirt story. But if in five years things didn't progress as quickly as you thought, what might have happened?

Speaker 2

I mentioned that I think that incorporating practice, like allowing the robot to practice the task, should be really helpful for allowing robots to get better. We don't know what exactly that recipe will look like, and so it's a research problem, and with any sort of research problem, you don't know exactly how hard the solution is going to be. And I think that there are some other more nuanced unknowns as well that are somewhat similar to that.

And we have a large number of very talented researchers on our team because we think that there are some of these unsolved breakthroughs that are going to be needed to really truly solve this problem.

Speaker 1

So, if it does work well and things progress in that universe, what would you be worried about?

Speaker 2

Good question. I mean, if things work well, I shouldn't be too worried in general. I do think that it's very easy to underestimate the challenges around actually deploying and disseminating technology. That takes time, and when the technology doesn't exist yet, that means that the world is not in a place that is ready for that technology. I think that there are a lot of unknowns there.

Speaker 1

I mean, one of the striking things to me about, say, language models, is the people who know the most about them seem to be the most worried about them, which is generally not the case historically with technology, right, with the possible exception of the atomic bomb. And so I'm curious, I mean, those kinds of worries, do you share them? Are there worries you have about developing a foundation model for robots, about bad actors using it, even?

Speaker 2

I do think that, yeah, there's plenty of technology that has dual uses, and I think there are applications of technologies that are harmful. I think that a lot of the concerns in the language model community stem from imbuing these systems with greater autonomy. And I work hands on with the robots quite a bit, and I don't see a world in which they will be taking over in any way. It's very easy to just, well, with our current iteration of robots, if we threw some water on it, the robot would be in trouble.

Speaker 1

So that might be a problem for you, but I'm sure you could solve that.

Speaker 2

We're working on it. We actually do have a new iteration that is a lot more waterproof. But it's just not a concern that I share.

Speaker 1

Okay, interesting. Basically just because you think we can, whatever, turn it off if we need to.

Speaker 2

Yeah, and I think there's always going to be dual use concerns, but I think that the pros of the technology outweigh some of the downsides.

Speaker 1

Well, give me the happy story, then. Like, what number of years should we choose for a happy story? Ten? Is ten too soon?

Speaker 2

I don't want to put a number to it. I think that with research, you don't know exactly how long things will take. And I can envision a world where, when you're developing hardware, it's not too hard to actually teach it to do something, and teach it to do something useful, rather than just having machines that are not particularly intelligent, like dishwashers and laundry machines and so forth.

Speaker 1

Go bigger, if you would. Like, what would people teach robots to do in that world?

Speaker 2

I guess if we were to go bigger, I think that there are a lot of challenges around helping people as they age, allowing them to be more independent. That's like a huge one. I think that, I don't know, manufacturing, there are all sorts of places where there are abusive labor practices, and we can maybe be able to eliminate those if it's a robot instead of a human. Yeah, many, many examples. And I think that there are also even things that are hard to imagine because the technology doesn't exist. So a lot of the things that I'm thinking about are robots helping humans in different circumstances to allow them to be more productive. But once something exists, people are creative and come up with new ways of how it's used.

Speaker 1

We'll be back in a minute with the lightning round. Great, let's finish with the lightning round. What's one thing that working with robots has caused you to appreciate about the human body?

Speaker 2

Our skin is pretty amazing.

Speaker 1

Huh. Well, so we didn't talk about, I mean, a sense of touch, or of heat or of cold, right? I mean, presumably the models you're building, the robots you're using, don't have that, but they could, right? They could have a sense of touch. Is anyone working on that? Is that of interest to you?

Speaker 2

Lots of people are working on it. I think it's pretty interesting. I think that the hardware technology is not super mature compared to where I'd like for it to be, in terms of how robust it is, and the cheapness, and the resolution. That said, we actually put cameras on the wrists of our robot to help it get some sort of tactile information. For example, if you visually look at your finger as you make contact with an object, you can see it deform around that object, and you can actually, just by looking at your finger, get some notion of tactile feedback similar to what our skin gets. Yeah, and cameras are cheap, really easy, robust, way more robust and cheap than existing technology for tactile sensing.
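As an illustration of the wrist-camera idea, here is a toy contact check: compare the live crop around the fingertip against a reference image taken while out of contact, and flag contact when the pixels change enough (the finger visibly deforming or being occluded differently). The threshold, the crop, and the camera interface are all made up for the sketch; a real system would calibrate or learn this signal.

```python
import numpy as np

def fingertip_change(live_image, reference_image, crop):
    """Mean absolute pixel difference inside the fingertip crop."""
    y0, y1, x0, x1 = crop
    live = live_image[y0:y1, x0:x1].astype(np.float32)
    ref = reference_image[y0:y1, x0:x1].astype(np.float32)
    return float(np.abs(live - ref).mean())

def in_contact(live_image, reference_image, crop, threshold=12.0):
    # Threshold is arbitrary here; it would need calibration per camera and scene.
    return fingertip_change(live_image, reference_image, crop) > threshold
```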

Speaker 1

I've heard you say that humanoid robots are overrated, and I'm curious, why do you think that.

Speaker 2

I think that simplicity is really helpful and important when trying to develop technology. When you introduce more complexity than is needed, it slows you down a lot, and I think about the complexity that humanoids introduce. I think that if all of the robots we were working with were humanoids, we wouldn't have made anywhere near the progress that we've made, because we'd be dealing with additional challenges.

I also think that optimizing for ease of data collection is really important in a world where we need data, and it's a lot harder to collect data with and operate all of the different joints and motors of a humanoid than it is to control a simpler robot.

Speaker 1

Do you anthropomorphize robots?

Speaker 2

I hate it when people anthropomorphize robots. I think that it is misleading, because the failure modes that robots have are very different from the failure modes that people have, and it misleads people into thinking that the robot is going to behave in the way that people behave.

Speaker 1

Like, in what way?

Speaker 2

Oh, like, if you see a robot doing something, like doing a backflip, or even folding laundry, you kind of assume that, like if you saw a person do that, then they probably could do a lot of other things too. And if you anthropomorphize the robot, then you assume that the capabilities that you see are representative, as if it were a human, and that it could do a backflip anywhere, or that it could fold laundry anywhere with any item of clothing.

Speaker 1

Or surely you would think a robot that could do a backflip could fold a shirt, but no.

Speaker 2

Exactly, exactly. So sometimes it's fun to assign emotions to some of the things, or say the robot's having a bad day, because certainly it feels like that sometimes. But when it kind of moves beyond fun and jokes, it might have consequences that I don't think make sense.

Speaker 1

I read that there was a researcher who said they would retire if a robot tied a shoelace, and then one of your robots tied a shoelace, and I guess they didn't retire. But I'm curious, what would you need to see a robot do to retire?

Speaker 2

Hmm, I don't know. I guess one example that I've given before that I would love to see a robot do, and I don't think this is quite retirement level, is being able to go into a kitchen that it has never been in before and make a bowl of cereal. Pretty basic, especially compared to doing a backflip. I cannot do a backflip myself, but I can make a bowl of cereal. But it requires being able to find objects in the environment, being able to interact with delicate objects like a cereal box, maybe even use tools in order to open the cereal box, pouring liquids. Yeah, so that's a task that I love, and I could actually even see us being able to show a demo of that without too much difficulty if we put our mind to it and collected data for it. So it actually is, I think, more within reach than maybe I imagined a few years ago.

Speaker 1

Just as you're thinking about it, it's getting closer. You're like, oh, wait, we could do that.

Speaker 2

Yeah. I mean, we've actually collected data of pouring cereal, like opening a cereal box and pouring it into a bowl. We haven't yet done liquid handling and pouring, but I think we're actually going to do it this week on the robot. I asked the hardware team to make a waterproof robot. So we're not too far off. A lot of the pieces are coming together. Also, I love working with robots, and I'm also fairly young, I think, not too old, and so I don't imagine myself retiring anytime soon.

Speaker 1

Chelsea Finn is a Stanford professor and the co-founder of Physical Intelligence. You can email us at problem at pushkin dot fm, and please do email us. I read all the emails. Today's show was produced by Gabriel Hunter Chang, edited by Alexander Garreton, and engineered by Sarah Bruguerrett. I'm Jacob Goldstein, and we'll be back next week with another episode of What's Your Problem.
