Strachey Lecture: Computer Agents that Interact Proficiently with People

Jun 23, 2017, 41 min

Episode description

Professor Kraus will show how combining machine learning techniques for human modelling, human behavioural models, formal decision-making and game theory approaches enables agents to interact well with people. Automated agents that interact proficiently with people can be useful in supporting, training or replacing people in complex tasks. The inclusion of people presents novel problems for the design of automated agents’ strategies. People do not necessarily adhere to the optimal, monolithic strategies that can be derived analytically. Their behaviour is affected by a multitude of social and psychological factors. In this talk I will show how combining machine learning techniques for human modelling, human behavioural models, formal decision-making and game theory approaches enables agents to interact well with people. Applications include intelligent agents. The Strachey Lectures are generously supported by OxFORD Asset Management.

Transcript

Okay. Good afternoon, everybody. My name is Mike Wooldridge. I'm Head of the Department of Computer Science, and it's my very great pleasure to welcome you to the Trinity Term Strachey Lecture. And in fact, as you will surely have noticed, we have an exciting double bill this afternoon, because the Strachey Lecture will be followed, after a half-hour break, by the Lovelace Lecture.

But first the Strachey Lecture, and it's my enormous pleasure to welcome my colleague Sarit Kraus from Bar-Ilan University in Israel. Sarit gained her Ph.D. in 1992, one of the earliest PhDs in the area of multi-agent systems. And it was a system that played the game of Diplomacy, as I recall, and this system established many of the techniques that went on to define Sarit's career.

Although her most cited work is in the area of non-monotonic reasoning, the so-called KLM semantics, in my community she's best known for her work on automated negotiation and automated bargaining, and this is an area which she has really identified and made her own.

And over the last two decades, she's been the one person in that community who has really championed that area and pushed it forward very determinedly towards the goal of building agents that can negotiate and bargain proficiently with people, which is what she's going to tell us about this afternoon. She's won a string of awards, including the Computers and Thought Award, which is the most prestigious award that can be given to an AI researcher under the age of 35.

That was now some time ago. Sarit, we've both got a bit older since then. But she's then gone on to win a number of other awards. She's a AAAI Fellow, a Fellow of the European Association for AI, and was a recipient of the ACM Autonomous Agents Research Award in 2007. She's a good friend and a very close colleague. We've published many papers together and I look forward to publishing many more. She suffers, as I do, from the current affliction of being a head of department.

But when we're both through that, we'll both get back to it. So, Sarit, it's an enormous pleasure to welcome you to give us the Strachey Lecture. Thank you.

Thank you, Mike, for your kind words. And I would like to tell you what I like to do. I like to build agents. I like to build agents that interact proficiently with people.

You know, in this forum, I probably don't need to say what an agent is, but I am looking at autonomous, intelligent systems that plan, adapt, learn, and cooperate with people, or that face adversaries. But I'm especially interested in agents that interact proficiently with people. Now, what can these agents do? They can help people. They can replace people. And they can be used to train people. For example, supporting people: we have a project with General Motors.

There, the agent was helping and supporting a driver in electric cars to reduce the use of energy by helping set the air conditioning and the climate control system in an intelligent way. We had the ARMOR system, with Milind Tambe, that has been in place since 2007 at the L.A. airport, helping people decide where to put random checkpoints to keep the airport safe. We got an award from Homeland Security for the system. I'm really proud of it. They are really using it.

And then, on the other hand, I would like an agent that will whisper in my ear and help me convince my son to leave his smartphone and go do something. But I would really like these agents to replace people. For example, I would like to have an agent that will convince me not to eat the cake, but rather something healthy.

Or, we built, with Sheba hospital in Israel, a system that replaces a speech therapist in helping people with brain damage to train and improve their speech capabilities. And we can also use our agents to train people. For example, as I'll show you in a minute, we built an automated system that can be used for training people who want to get ready for an interview in the job market. Or we built an agent for the Israeli police to train people in interviewing, that is, to train especially young personnel to interview a suspect or other people.

Let me just show you a short video of this. So it's in Hebrew. Why? Because it was built for the Israeli police. But you have subtitles. [Video plays in Hebrew.] Okay. And so we built the agent that plays the role of the virtual suspect. And its answers depend on the questions that the young police personnel are asking.

And this led to a huge European Horizon 2020 project where we try to train people to collaborate in interviews, because, you know, the adversaries are collaborating, and the good guys also need to train themselves how to collaborate. So these are many projects where computer systems interact proficiently with people. What is common to all these systems?

So the main issue is that in all these systems we need to predict the user's behaviour in order to build a good agent. I need to predict whether a human will accept or not accept an offer. I need to know how my son will respond to an argument that I say in the discussion. I need to predict where the adversary will attack. I need to predict how drivers, in another project, will react to setting a car race in a location, et cetera.

Now, why is it so difficult to predict people's decision making? Well, I always say it's because people are upsetting my experiments. But really it's because they don't maximise expected utility. Why they don't maximise expected utility, I'm not sure. But probably because they are sensitive to context. They lack knowledge of their own preferences. They are affected by complexity. They have problems of self-control, et cetera. So that makes predicting human decision making very difficult.

Now, I would prefer to assume that people are following a really nice equilibrium strategy, but this rarely happens, at least in my experience. So what can we do? So we first start with some data, data about the decision making of people. And sometimes we have a lot of data; in many cases Google is collecting data for us.

But in many cases we don't have that many examples, as you will see, because collecting data on the decision making of drivers, how they respond to our suggestions on how to save energy, is more difficult. So in some cases we are using human behaviour models that were developed in the social sciences as a way to shape our models and to find good features and properties. Then with this we are using some machine learning.

So many times we need to develop our own machine learning algorithm because of the constraints of the problem. So we have a human prediction model. And then when we have a specific human with specific data, even if it evolves over time, we can enter it into the model and get some prediction. But this is just the beginning. Having the prediction, or a model of the human, is just the beginning of the story.

Then the main issue is that we have some optimisation problem or some game theory setting to solve, because our agent does have an optimisation problem to solve: it does have a goal, and we would like it to maximise expected utility or to satisfy the goal.

So the idea is to integrate the prediction model of the human into some optimisation or some game theory model, such that after we solve it, we have some action that the agent should take in the interaction with the human. And usually, if you have some nice virtual human around, it really helps the interaction with people.
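As a minimal sketch of this integration step, assuming a one-shot negotiation: a prediction model estimates how likely the person is to accept each offer, and the agent picks the offer with the highest expected utility. The logistic acceptance model, its parameters, the offers and the utility numbers are all hypothetical and are not taken from the talk.

```python
import math


def accept_probability(human_utility, bias=-2.0, slope=0.5):
    """Hypothetical prediction model: a logistic curve that maps the
    person's utility for an offer to a probability of acceptance."""
    return 1.0 / (1.0 + math.exp(-(bias + slope * human_utility)))


def expected_utility(offer, fallback_utility=0.0):
    """Agent's expected utility: its own utility if the offer is accepted,
    a fallback value (assumed to be 0 here) if it is rejected."""
    p = accept_probability(offer["human_utility"])
    return p * offer["agent_utility"] + (1.0 - p) * fallback_utility


def best_offer(offers):
    """Optimisation step: choose the offer that maximises expected utility."""
    return max(offers, key=expected_utility)


# Illustrative offers only; the numbers are made up.
offers = [
    {"name": "greedy", "agent_utility": 10.0, "human_utility": 1.0},
    {"name": "balanced", "agent_utility": 6.0, "human_utility": 6.0},
    {"name": "generous", "agent_utility": 2.0, "human_utility": 10.0},
]
chosen = best_offer(offers)
print(chosen["name"])
```

With these made-up numbers the greedy offer is rarely accepted and the generous one is worth little to the agent, so the balanced offer wins. A real agent would replace the logistic curve with a model learned from data and the single choice with a multi-step strategy.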

But this is not our main issue. So, for example, if we are thinking about an automated agent that negotiates with people, we had these aspects. We have data that we use to predict whether people will accept an offer or not. And we also use data because, with people using chat, we need to recognise what they are saying in the negotiation. We also use data in machine learning. But then we had some optimisation problem to solve: what will be the best strategy for the negotiation,

so that our agent will get the best result from our point of view. And we also use the human behaviour models, such that it will help in some decisions of the agent that we can solve by solving an optimisation problem. So let's just see our system. So I was offered a job, and I am asked to come in and discuss the terms of my employment. Okay, so let me just show you what it is doing.

The logic behind NegoChat is based on two bounded rationality theories: anchoring and aspiration adaptation theory. At the start of the negotiation session, NegoChat presents a full offer. While we don't expect the human counterpart to accept this offer, it does provide an anchor or a basis for further negotiation offers. The agent incrementally focuses on one issue at a time, based on which issue is the next most important, or aspired.

The employer's most important value is salary. NegoChat shows flexibility in negotiating for an aspired value, salary, and tries to find a counteroffer that yields a similar utility value to itself while incorporating the employer's request for salary. Okay. So, you know, you young people, if you need to train for negotiating about salary, you can log into the system. It's online, and you can try and train yourself. Anyway, so that was one aspect in negotiation.

In the cases I showed you, the domain was not that complex. We knew it well in advance. But in other settings it's not just that we need to predict the human behaviour; we also need to predict things about the environment, and again we need a model. So we need a model for the human and we need a model for the environment. And again, we have data; we need to collect data on both.

We have the machine learning method to build the models, and then it goes into the optimisation of the agent, and the agent takes actions. So I want to present in this talk two examples that demonstrate the use of this methodology. One is about providing arguments in discussion, and it's with my student, Ariel Rosenfeld. And the story was like this.

People from Intel came to me a few years ago, and they wanted to develop an agent that will whisper in the ear of people when they are having discussions. And they said, you know, you have this non-monotonic reasoning, which was the basis for argumentation theory. Why won't you use argumentation theory to build such an agent? And the student was very excited. And he went and found many, many, many papers with beautiful, really nice theories about argumentation.

And we read the papers and we said, okay, we'll use this. But before we use it, I told him, you know, I know that game theory is not a good predictive model for people's behaviour. Before we use argumentation theory for our model, let's check that people are following one of these theories of argumentation. So what we did: we took from the literature six fictional cases that appeared in the literature of argumentation theory.

We recruited people from Amazon Turk, and we let them negotiate. And you can see it here in the format of argumentation theory. I remind you, you have nodes, which are the arguments, and they can attack or support other arguments, and there are always some ways in argumentation theory to decide what is a strong argument.

And here the wife is saying, let's buy an SUV, and the husband, or the other way round: the husband said, let's buy an SUV, and the wife said, it's too expensive. And so we asked, what will be the next argument you will say? So you can say, well, an SUV is safe. Or you can say, yes, it's too expensive because of the taxes. Or you can say, I'm taking a loan. Or you can say, attacking the loan, that taking a loan has high interest. Okay, so what will you say in such a situation?

So we collected what people said, and the most common choice was the argument, you know, let's take a loan. And it turned out that any argumentation theory that you use will give very low weight to this argument. And as I said, 35% of the people chose this one. But this is just an example, and the weight is really low regardless of whether you use extensions or numbers or anything. So that wasn't that nice. I was very disappointed. The student was extremely disappointed.

So I said, well, okay. He said, these are just fictional arguments. You know, why won't we go and look at real argumentation? So the student found a database of transcripts of argumentation, arguments people said about various topics. And he chose two topics: capital punishment and trial by jury.

And this was a database from 1995. And he checked whether the arguments that people said belong to extensions. Extensions are a concept in argumentation theory of what are good arguments, the ones that should be said according to the theory. And it turned out that less than 45% of the arguments said in the discussions belonged to any extension. And that was extremely discouraging, because these were really real discussions.
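To make the notion of an extension concrete, here is a minimal sketch that computes the grounded extension of an abstract argumentation framework, one standard semantics among several (the talk does not say which semantics were checked). The attack graph is an assumed encoding of part of the SUV example, with only the attack relations and no support relations.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension by iterating the characteristic
    function from the empty set until it reaches a fixed point.

    `attacks` is a set of (attacker, target) pairs.
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension = set()
    while True:
        # An argument is acceptable if each of its attackers is itself
        # attacked by some argument already in the extension.
        defended = {
            a
            for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended


# Assumed encoding of the SUV dialogue (argument names are illustrative).
arguments = {"buy_suv", "too_expensive", "take_loan", "high_interest"}
attacks = {
    ("too_expensive", "buy_suv"),
    ("take_loan", "too_expensive"),
    ("high_interest", "take_loan"),
}
result = grounded_extension(arguments, attacks)
print(sorted(result))
```

Under this encoding the loan argument is attacked by an unattacked argument, so it falls outside the grounded extension even though it was the choice many participants made.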

So, just to summarise this: we have the fictional cases, we have 142 people with transcripts, and we also have students that did a chat, 72 people doing a chat, and most of them didn't follow argumentation theory. So what can be done? Okay, I said, let's use our methodology. What does the methodology say? Let's try to predict what people will say in their argumentation. This will be the first thing, you know.

So we collected data from Amazon Turk and also from computer science students in Israel, and they did both this and others. And we tried to find features about this argumentation. The features that we used were based on argumentation theory, like justification. But we also took features from psychology, and also a new concept that we developed, which we called relevance: how close this argument is to the previous arguments.

Look, this is a tree. Okay. And then we were able to do quite well. This is us. It depends on how many times I saw you. So, for example, after four responses of the person, I know quite well how to predict the fifth argument that you will propose. And these are the others, like random, or going according to the majority. This is ML, and we did quite well. Interestingly enough, we did it in two countries.

People from both countries didn't do exactly the same. But I can use the model: let's say I take the model I learned on the computer science students, where I got 77% accuracy, and apply it to the Amazon Turk people, and I got 72% accuracy, which is really nice, that it is transferable between countries. We did the same prediction in the capital punishment domain, and we again got quite a good prediction model. But as I told you, prediction is just the first step.

How do I use a prediction to suggest to people which argument to say next in the deliberation? So we looked at different possibilities. One is just to offer the prediction. You know, I will see what the best predictions are and then I will choose one of them. Well, I was very reluctant to do it, because I said, you know, if a person is already thinking about an argument, will they really like my suggestion?

But, you know, I don't understand people; I just understand predictions. Okay. So we also had the concept of relevance, and weakly related. There I was thinking that we look at the graph (each of them is based on some mathematical model, but I'm just summarising) for things that are far away from the argument, because these are innovative arguments. We also had prediction plus relevance. And I said, okay, I will also suggest something from argumentation theory and see if it's liked.

We compared it with not saying anything and with random. So we had 204 participants in this experiment, and you can see which ones did the best in terms of acceptance rates: if I give you a suggestion of what to say in the discussion, will you accept my proposal? So this is the average. So prediction plus relevance did the best. So I do a prediction and I have several arguments that can be said,

and I choose according to relevance. And then only relevance was also doing well, and also the prediction. Weakly related, to my surprise, did the worst. I was thinking, if there is an argument that I didn't think of, I would like to hear it as a suggestion, but people didn't like it. And similarly, if we are looking at the satisfaction of the people with the suggestions they got, again, they liked this one.

Okay. So I was quite happy with this approach, that we can help people in the discussion. But really what I want is an agent that can convince people to do something. As I said, you'll remember the cake, et cetera. Okay, so we said, we have a methodology. We need a model of human argumentative behaviour, we need a model for argumentation, we build an optimisation problem, and let's see how we do it. So we started with argumentation theory. Here we combine two models.

One is the bipolar theory and one is also based on weights, and each of them gives different possible conclusions. So we chose one of them according to our experience. So this is how the model looks. We have arguments in the nodes, and we have attacks and supports, but we also have weights on the attacks and supports. And we also have a weight on the argument if it stands by itself, that is, its own strength. And the way that we built it has some advantages.

Now, if we want to use it with people, we need to know where these numbers will come from. So we took two domains. One was about convincing computer science undergraduate students to do a master's degree. In Israel, you first do a master's degree and then a Ph.D. So that was a good domain. And also this cake issue. Just so you know, in this experiment, people came to the lab, and they said what they want: a healthy bar or a chocolate cake.

And then we tried to convince them the other way around. And when they left, they got one of the things. We brought them to the lab and they got the cake or the healthy bar. Well, I'm not sure it is that healthy, but never mind. So first we collected human dialogues to build these possible trees or graphs for argumentation theory. And then we also gave people questionnaires about how strong these attacks are, and other things about the argumentation.

And we collected a lot of data to build this. Interestingly enough, in these argumentations, when we let them talk just to collect data, it turned out that 33% of the students used in the argumentation at least two arguments that attacked one another. What can I say? So we have an argumentation framework to model human argumentative behaviour.

And then we also used machine learning to predict a distribution over the arguments that the person will say next, given the sequence of arguments that he has said so far. So we used this data also for prediction. So we have all this input. Now we came to the optimisation problem. Now, this was tough. What do we have? What do we know about these people? We almost don't know anything about them, because what is their argumentation framework? We don't know. What are their beliefs? We don't know.

So we built a POMDP, where the states are our belief states over the argumentation graph of the person. And we updated it over time, because we have observations. The observation is the argument that the person really said. So this is a good indication about his model of argumentation. And we used all the predictions and probabilities for building the POMDP. And now, given that we have the POMDP, we solved it. Well, is it right to say we solved it?

Because it was not easy to solve, you know, these huge POMDPs. And we used Monte Carlo tree search and some smart optimisation solutions to solve the POMDP, and this gave us the strategy of the agent in argumentation, convincing people. Did it help? Well, we ran this experiment with new people, of course, not the people that we checked earlier. And luckily for us,

we were able to show that in the master's degree domain we did as well as people who tried to convince their friends. And actually, in the chocolate cake domain, we were doing better, significantly better, than the baseline, the one that was following some argumentation theory, and humans didn't do very well either. I must say that in general, as you noticed from the percentages, convincing another person to change his mind is extremely difficult.
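The belief-state idea behind the POMDP described above can be illustrated with a small sketch of a single Bayesian belief update: the agent holds a distribution over candidate models of the person and reweights it each time the person states an argument. The two candidate models, their names and the likelihood numbers are hypothetical, and this shows only the update step, not the Monte Carlo tree search used to choose the agent's arguments.

```python
def update_belief(belief, likelihood, observed_argument):
    """One Bayesian update of the belief over candidate models of the person.

    belief: dict mapping model name to prior probability.
    likelihood: dict mapping model name to a dict from argument to the
        probability that a person with that model states the argument.
    """
    unnormalised = {
        model: prior * likelihood[model].get(observed_argument, 0.0)
        for model, prior in belief.items()
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        # The observation is impossible under every model: keep the prior.
        return dict(belief)
    return {model: value / total for model, value in unnormalised.items()}


# Hypothetical candidate models and likelihoods, for illustration only.
belief = {"price_sensitive": 0.5, "safety_minded": 0.5}
likelihood = {
    "price_sensitive": {"too_expensive": 0.8, "suv_is_safe": 0.2},
    "safety_minded": {"too_expensive": 0.3, "suv_is_safe": 0.7},
}
posterior = update_belief(belief, likelihood, "too_expensive")
print(posterior)
```

After the person says the car is too expensive, the belief shifts towards the price-sensitive model. The agent would then plan its next argument against this updated belief.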

And while we are doing at least as well as people, trying to convince people to change their mind is not that easy. Now we have a project where we have three people, and each tries to change a person that doesn't have an opinion. So if I don't care whether I have a cake or a bar, it's easier to convince me to take one of them, so we will try. So this is ongoing work, and we'll see how we are doing in this case. So that was one example.

I wanted to show you how we develop and use this methodology. I want to show another project, where we also had to model the environment. And this also has to do with robots. And this is the work of, again, Ariel Rosenfeld, and Noa Agmon, and Oleg Maksimov, and Amos Azaria. So now, move to another thing. Forget about argumentation and discussion. Just think about one person trying to manage some robots. So it can be a soldier, but it can also be with drones.

And these days, with drones and these robots, you usually have one person on one drone, or one person that manages a few if they are doing small things. Even if the drones are autonomous UAVs, I don't know if you know, there are two people on one UAV these days. And the challenge was, can we put one person on many autonomous robots that are cheap? And it's really a problem. If you see, this is my lab, and the robots are moving around, and you should come to my lab and try to manage ten robots.

The robots are looking for green balls. So you say, these are autonomous robots, why do I care? Let the robots move around. But the robots have problems too. You know, they bump into places, and sometimes, you know, these robots like to get into the ladies' room in my lab, and then getting out is quite difficult. And while they can do it eventually, a human operator can easily manoeuvre them out, and they are on their way to continue searching for the green balls.

Or sometimes they run out of battery. And in addition, there are still several decisions that people don't want the robot to make, but want a person to make. When the robots find a green ball, people want to get a confirmation that this is really a green ball and not just a green box or something. So the problem is that there are these ten robots moving around, and the person who is trying to operate them is getting overwhelmed.

And given that, the question is, can we have an agent that will support the operator and will help the robots and the operator to get better results, in our case, find more green balls? So here you see the robots. These are the robots, and they are going around looking for the green balls in my lab. And they are very cheap. They cost less than 1500. And this is the interface of the operator. Okay. We'll go back to these robots in a minute.

There will be another movie. So, agent design. We need a model for the human. Who is the human? The operator. We need a model for the robots, because they are the environment. We don't know how they will behave. So we collected data on the human behaviour. We collected data on the robot performance. The issue was that it's really difficult to collect data when the robots are moving around, running out of battery, and so on.

So the interesting thing here, from the machine learning point of view, was that we collected data from the simulation and then we deployed it on real robots. So that was quite nice. And then we have an optimisation problem to set and solve, to give the best advice to the operator: what should he do next? That was the advice we gave him, and this turned out to be extremely useful.
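As a rough illustration of turning the two learned models into advice, the sketch below ranks candidate pieces of advice by estimated gain per second of operator attention. The scoring rule, the field names and all numbers are assumptions made for illustration; the talk does not describe the actual optimisation problem in this detail.

```python
def rank_advice(candidates):
    """Order candidate advice by estimated gain per second of operator time.

    `expected_gain` stands in for a robot-model estimate (for example, extra
    productive search time) and `operator_seconds` for a human-model estimate
    of how long the operator needs to carry out the advice. Both are
    hypothetical quantities.
    """
    def score(candidate):
        return candidate["expected_gain"] / candidate["operator_seconds"]

    return sorted(candidates, key=score, reverse=True)


# Made-up candidates for illustration.
candidates = [
    {"advice": "free stuck robot 3", "expected_gain": 40.0, "operator_seconds": 20.0},
    {"advice": "confirm ball seen by robot 7", "expected_gain": 15.0, "operator_seconds": 5.0},
    {"advice": "spread robots 1 and 2 apart", "expected_gain": 10.0, "operator_seconds": 10.0},
]
ranked = rank_advice(candidates)
print(ranked[0]["advice"])
```

A greedy ratio like this is only one simple way to pose the problem. It makes the point that the advice depends on both the operator model and the robot model at once.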

So we have 150 hours of simulation, with human operators that came to the lab. And then, after we had the agent, we ran three experiments, including in the simulation. But the most interesting experiment was bringing real people to the lab twice, with a week in between. Once they managed the robots by themselves; once they managed the robots with the help of the agent. Half did this first and half did that first, just to be fair to all of them. And the agent really helps. On average,

the robots found seven balls when the operator didn't have the help of the agent, and 14 balls when the operator did have the help of the agent. And it's not just that it was an average; it helped all the people that came to the lab. The agent helped all of them. And I must tell you, you know, I'm not that good at manoeuvring robots, but we also had quite good operators, and you can try it. It's really helping people do it.

So let me show you another video demonstrating this. I must tell you why we have this video. Because my students wanted to do a video to send to the IJCAI competition on videos for robots. And I said, come on, you know, this has nothing to do with research. But they were very excited. I said, okay, I will give you some money, but don't come to me for anything about this stupid video. And they won first place. So what can I say?

This past week, there's a student missing in this section of the campus.

In disaster environments, both the victims' and the rescuers' lives may be in danger. Robot technology can provide solutions. Today's robots are semi-autonomous and require a human operator to help whenever they cannot handle the situation on their own. Agent three, move away from agent four. Drive robot six to a better position.

Fast-paced, simultaneous streams of data from multiple robots can be overwhelming, and managing them can be a gruelling task for a single operator. The robots are too close together; spread them around. Drive robot six to a better position. Here at Bar-Ilan University, we have developed an intelligent agent that supports human multi-robot team collaboration.

The agent prioritises the different tasks and generates advice, offering situational awareness and thereby allowing you to make better decisions. The agent uses a novel methodology combining learning and optimisation, which accounts for both the operator's ability and the robots' performance in real time. Through extensive empirical evaluations,

we have shown that our agent was able to significantly enhance the team's performance, which can translate into lives being saved. Okay. So we actually used this approach in another domain, which is a storage domain, where you have robots that bring the items to the operators that pack them, and here the person needs both to pack and to manage the robots. And this is extremely difficult.

We had a simulation of robots, shelves, boxes and packing, and the agent was helping the operator. And again, we adhered to the same approach of solving the optimisation problem. And you can see that people are doing much better when they have the agent as an advisor, advising what to do. So if I summarise: automated agents can help people, can replace people, and can be used to train people. I demonstrated a few of the projects that we did.

It's a difficult task to build these agents, and we need a lot of things. We need machine learning algorithms. We need optimisation. We need to understand interaction and use game theory models. But putting it all together, I'm having fun. Thank you.

Transcript source: Provided by creator in RSS feed