¶ Intro / Opening
Hello and welcome to the Behavioral Design Podcast. This season we're diving into the intersection of behavioral science and AI. We want to make sense of the state of AI, from understanding how humans interact with intelligent systems to using AI to do behavioral design itself. I'm Aline Holzwarth, a health tech advisor specializing in AI and product design. Over the past 15 years, I've been crafting human-centered products with behavioral science
at the core. At Apple, I led behavioral science for health AI, designing and launching AI-powered features to help users reach their health goals. And I'm Samuel Salzer, your second co-host. I'm a behavioral strategist specializing in habit formation and designing products that drive long-term behavior change. I work with leading tech organizations integrating AI to scale behavioral design for good.
And I'm also the founder of Baby Bites, a dedicated community on behavioral science and AI. Quick word on Nuance Behavior, where we help organizations build impactful digital products using behavioral design. We only take on a few clients at a time to ensure the highest level of quality for our tailored, evidence-based solutions. If you'd like to become one of our special projects, e-mail us at [email protected] or reach out directly through our website, nuancebehavior.com.
Hey, Aline. Hi. Hi. Hey, I think we have to start this episode with something a bit unfortunate, where we have to make a redaction, retract something previously said. Oh, no. Something that I said or you said? It's unfortunately something that I said, and we don't like to do this, because it means that something has been said in the past that
is inaccurate. But I think it's worth clearing the record and making sure that we're held accountable to what we say, right? So in a past episode earlier
¶ Introduction and Correction
this year, I said that I was optimistic and happy. I want to take that back. That was, I mean, that was clearly wrong from the get-go. For being honest, yeah. I know. And well, to be fair, the things that I said I was happy about, I am actually still kind of happy about, in terms of the work and so on. But the world has been a little bit of a bummer, and it's been really hard, I think, to just, you know, exist
in some ways. So I do want to say that I have become a bit more measured around some of those feelings I expressed before. Yes, you've joined me in my cave of sadness. Yes, so at least we're together in the same cave. At least we're together on opposite ends of the world. Maybe someday we'll meet in real life, Sam. Maybe someday, maybe someday. We actually never have. That's actually the wildest thing about this. Really? Wild. I love thinking about that, yeah.
But maybe this year, maybe this year. Actually, now that we're talking about this, my optimism is increasing again. Like, now I'm getting a little more optimistic and happier just with the potential that maybe this year we can actually meet for the first time. Yeah. Is there anything else you want to talk about, or anything you can say, that will distract me from the world that's going on or make
me more happy in some way? So, this episode was really fun to record with Eric Hekler, and there are a lot of interesting things to think about in terms of when we use AI, when it's really the best tool or approach for a task, and when a different approach might be better suited. And this is something that, you know, I like to think about in my daily life. And I know that you, Sam, you too have a lot of experience trying every single AI tool out there for all kinds of tasks,
right? Not just in behavioral design, but in, you know, automating all sorts of workflows, helping generate lit reviews and so on, doing all of the tasks both within behavioral science, but also just, you know, as part of being a human. So I know that you've reflected a bit on this and I'm curious: when you think about when it is and when it isn't the best tool, are there some obstacles that you've encountered where you've thought, hmm, AI is just
not very good at this? Yeah, that's a really interesting question. I think in some ways the really interesting experience I've had is building and updating my intuition around what tasks are worth automating. You mentioned automation, and I've been doing that for a long time, increasingly with AI as well. And I think the truth is that there's so much more complexity in even the seemingly simple tasks.
I still do a lot of cool stuff that I'm really excited about, but I do think definitely that like, we underestimate how even the simplest of tasks can become really complex when you start to break them down. That's right.
¶ The Efficiency Trap of AI
I think there's a little bit of an efficiency trap, and we can owe this a bit to the hype around, you know, how people talk about how you're going to be working at 2X or 10X as soon as you implement this tool or whatever. And I think part of that is because when you press play, and you know the machine is running, getting your output maybe takes, you know, ten seconds, or one second,
whatever. But all of the building of systems and the structuring and the gathering, and then all the checking, making sure, like, oh, is there a hallucination here, verifying the results: that counts too. Yeah. And I think you can call this a trap. I actually had a conversation earlier today with someone, and they spoke about this around writing, where they have been writing a lot for a long time. They've written several books and so on.
And they have found themselves, because they know how good AI is at writing and how useful it is for writing, suddenly noticing that they have shifted towards relying more on AI to write. And oftentimes it could technically save time. But she noticed that basically she spent around the same amount of time writing an article, but instead of doing the fun part, the actual writing that she enjoyed, she found herself mostly doing
editing. And she hated doing the editing, because she gives the direction, it writes something, and then she has to edit it. So basically, she's spending approximately the same amount of time to write an article, but she just shifted her time to the thing that she least enjoys. Yeah, you're spending much more of your time being frustrated because you're like, this isn't good. Make it good. Come on. More concise, more tight. Exactly. Please write this in my writing
style. Gosh, I'm constantly telling ChatGPT, can you write this more like me, please? And that's a losing battle oftentimes. It can be really tricky and I think it can be a trap for a lot of people, but I think it exists on a spectrum, right?
And I think that's the, I think the beauty of this today is where we're starting to maybe add some granularity to interacting with AI, where we see it as kind of like not something we do or don't, but it's more like, when does it make sense to do it and in what way?
Right, or how much of a partner is it, if we're going to say, on one end of the spectrum, you know, fully human, no AI, just a manual human brain, to the other end of the spectrum where it's only the robot. Where's the sweet spot where you're using it?
¶ Human-AI Collaboration
Yeah, I think a lot of it comes down to what degree you're accepting the fact that AI can be a really good collaborator and useful. But I like to talk about this human-AI sandwich a lot. The idea that, you know, the best thing you can do with AI is to create a beautiful sandwich with it. Meaning that you're not asking AI to do everything, but you're adding a human layer on the first 5 to 10% of the work.
Then, you know, again, it can be challenging even if you have a lot of material for it to be guided by, but oftentimes it can do that kind of middle part relatively well, especially if it's a very clear task, whether it's writing an e-mail or whatever you want it to do; it can oftentimes do the job really well. And then there's a human on the
other side. I love this one, and I think that, to shamelessly extend the metaphor even a bit more, you could say that the toothpick that is placed through the sandwich, that's like the human orchestrator, right? You're like the conductor who's making sure that everything falls into place and is working as intended. Wow, amazing. I love it. And when we talk about AI adoption, AI adoption is kind of
a hot topic right now, right? And I think a lot of it, you know, sort of goes into all of the historical means of getting people to adopt a new technology. It's unfamiliar. And, you know, we really could go back to the basics: how do we get people in smoothly? How do we make sure that the technology is actually adding value to the person who's using it, and so on. But you're right, part of it is just familiarity and getting used to it and seeing
how it works. You have to kind of get over that first initial bump of using a new thing, just any new thing, right? Yeah. And one concept that I think is really useful for understanding this is the jagged frontier, something that came out of a research paper by Ethan Mollick and, I think, a few other co-authors from Wharton.
And basically the whole idea is that as AI is evolving, it is evolving in a very paradoxical way, where it's super smart and super stupid at the same time. So you ask it one thing, it blows your mind. You ask another thing, you feel like this is so underwhelming. You know, you ask it to play tic-tac-toe and it doesn't really know how to do it. You ask it to explain some metaphysics thing to you and it's really good.
Yeah, it's like, my four-year-old can play tic-tac-toe. I mean, he cheats, but he knows how it works. That's part of the game, I think. That's part of the game, that's right. So I think also, understanding the jagged frontier element of AI means that as we're interacting with it, as we're engaging, it's to be expected that it's going to sometimes surprise us in positive ways and sometimes disappoint us as well.
¶ Conversation with Eric Hekler
Yeah. And these pros and cons in particular, as well as a bunch of others that I'm not going to go into yet because they're in the episode, really came out in my conversation with Eric Hekler. Yeah, tell us about the episode. I'm really interested to hear: what did you and Eric cover? So, one thing that he brought up that I really like as a sort of lens for thinking through this question of what's the right tool, in particular when designing for behavior change:
he has this mantra. People are different, context matters, and things change. And he kind of used this as a guiding light to frame his approach and the questions around it, and, you know, to really sort of force his hand in one direction or another. Is it accomplishing the goals that he set out to accomplish? And in a way that's, you know, efficient, not only in the way that we were talking about efficiency, but also
computationally efficient. Does it make sense for the resources that you have, the population that you're dealing with, and so on. And he's used all kinds of tools like micro randomized trials. This is something that we talked a lot about with Susan Murphy in a previous episode. But then he also compares machine learning and control
systems. So you'll see, if you're not familiar with control systems, we really get into this approach and when it might be more or less useful than something like an AI-based approach. Cool. And so I think it's time to get into it. I'd love to introduce you to Eric Hekler. He's a professor at UC San Diego, and he works at the intersection of public health, health psychology, design, and, of course, control systems engineering. So here's Eric.
Eric, welcome. Thank you, excited to be here. Awesome. So as you know, this season of the podcast is all about examining this intersection of behavioral science and AI. And so naturally, a lot of the guests who have come on the show have talked about where they've used AI to accomplish some sort of objective. You are sort of unique in that you've tried to accomplish the same objective, helping people move more, for example, using a variety of strategies.
So sometimes using AI and ML, sometimes not. And so I think you're really special because you've put a lot of thought into when AI might be the best tool or partner and when it's not. And so I want to kind of work up to this more general question of when to AI and when not to AI. But I think it would be really nice to start by just talking about a specific research project where we can see this very question up close. And so if you could, I'd love to
¶ Just-in-Time Adaptive Interventions
dive into your recently completed work with Jungwan Park. This is the dissertation that he just finished about optimizing just-in-time adaptive interventions, or the JITAI (the "Jedi") that we previously talked about with Susan Murphy on another episode. So if you wouldn't mind, I'd love it if you could start by just describing the goal of this project. Yeah, thank you. So this was a project in partnership with Daniel Rivera and Pedja Klasnja as well; we were multiple PIs on this
NIH-funded project. The high-level goal: there's an assumption that I tend to make in my work, which is basically that people are different, context matters, things change. Anyone who's heard me has probably heard me say those words before. So what we're trying to do with this work is figure out, OK, if that's the case, how can we still help people? In particular, if people are different, context matters,
things change. Is there any sort of predictability in that? And so that's really what we were trying to do with this just in time adaptive intervention.
¶ System Identification Experiment
So what we did is we developed what's called a system identification experiment. And that's, you know, very much like a micro-randomization trial, which was discussed previously with Susan. We still do experimental manipulation: we offer things and don't offer things, or we have things that we experimentally manipulate through randomization and
otherwise, right? The key difference, though, is that with a system identification experiment, which grows out of the field of control systems engineering, the goal is not to study if, quote, the intervention works. That's more of a statistical idea. We can do that, by the way; literally the exact same experiment can be treated as either thing, right? But what we did with a system ID experiment is we're actually trying to produce models, predictive models, that help us
to basically describe people. So coming back to the people are different, context matters, things change. Our goal is to build models that were built up for each individual, so everyone has their own model and to try to discover, you know, what are repeatable patterns to help us to know when, where and for whom to give people different types
of support. And so ultimately the goal, once we have those models, looking at it from a control systems engineering perspective, is that those models allow us to have this little simulation environment where we can experiment, in a simulation space, with all the different types of decisions
a future just-in-time adaptive intervention could offer, you know, like we'll send you a notification to go for a walk, we might give you a higher step goal the next day or a lower step goal, or otherwise, right. And the controller basically uses this model to run simulations of what a person might do in different contexts and different situations, and from that make a decision.
And so ultimately the goal of this project was to see if we could build models that were appropriate for each individual, that sort of discover the just-in-time states, or you might also call them the teachable moments, on exactly when, where, and for whom any given intervention strategy would be useful for a person or not
over time. And with the goal that once we had that, we could then basically incorporate that into a control system that could be scaled and run with lots of people where everyone would be getting their own personalized and perpetually adaptive intervention.
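To make the idea of one model per person concrete, here is a minimal illustrative sketch in Python. It assumes a deliberately simple linear form (yesterday's steps, today's goal, and whether a notification was sent); the actual model structure, variables, and estimation methods used in the study are richer and are not specified here.

```python
# Illustrative only: a toy per-person "system identification" model that predicts
# daily steps from yesterday's steps, today's step goal, and whether a walk
# notification was sent. The study's actual models are richer; this just shows
# the idea of fitting a separate model for each individual and then using it
# to simulate candidate decisions.
import numpy as np

def fit_person_model(steps, goals, notified):
    """Ordinary least squares fit of steps[t] on steps[t-1], goals[t], notified[t], 1."""
    y = np.asarray(steps[1:], dtype=float)
    X = np.column_stack([
        np.asarray(steps[:-1], dtype=float),    # yesterday's steps
        np.asarray(goals[1:], dtype=float),     # today's goal (an input we control)
        np.asarray(notified[1:], dtype=float),  # 1 if a notification was sent today
        np.ones(len(y)),                        # intercept
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                                 # one coefficient vector per participant

def simulate_steps(coef, prev_steps, goal, notify):
    """Use the fitted model to 'simulate' what this person might do tomorrow."""
    return float(coef @ np.array([prev_steps, goal, notify, 1.0]))

# Hypothetical usage with made-up data for a single participant.
coef = fit_person_model(steps=[6000, 7200, 6800, 9000, 8500],
                        goals=[7000, 7000, 8000, 9000, 9000],
                        notified=[0, 1, 0, 1, 0])
print(simulate_steps(coef, prev_steps=8500, goal=9000, notify=1))
```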
So for our behavioral scientists who may be more accustomed to randomized trials where they have, you know, real human participants every step of the way, at what point do you move from having a real person engaging with the intervention to having simulations and, and how do these sort of play with each other or, or feed into each other? Yeah, that's a great question. In brief, humans are always involved in our loop. So maybe I'll jump forward.
There's actually another trial that we're running right now, which is probably much more normative for our behavioral scientist colleagues. It's funded by the National Cancer Institute, and this one is a randomized controlled trial where we're basically studying whether this overall control systems approach to developing a personalized and perpetually adaptive intervention is working or not. And so first, just in terms of
RCT logic and language: this intervention that we're running, we're framing it as a personalized and perpetually adapting intervention. It's basically adjusting daily step goals on any given day. It also has this thing where, inspired by my colleague MC Schraefel, we have something called an experiment in a box; that's the language she used. It's basically self-study,
self-experimentation, right? So basically there's a whole bunch of packages and features for this, right? And then the comparator for this, we've modeled it after a worksite wellness program, actually Qualcomm's, a large company in San Diego. That's basically our comparator condition, right. So 386 people, we're doing this at scale. We already recruited all of our people, and we've already gone through about 240 or so at the time of this
recording so far. So this is just to demonstrate that we can do this at scale with a lot of people and whatnot. So to come back to your question, OK, what's the difference between this, how does randomization work, and where does the simulation come in? The basic idea: we called it a COT, a control optimization trial. It's really a way to build a personalized and perpetually adapting intervention. And the way it works is, Phase 1 is what I was just describing to you.
It's called the system identification experiment. And so, to put it into more common behavioral science language or terms, you could think about it as basically trying to figure out the right tailoring variables for each person, right? And so this is how we do personalization. You know, we gather data and we're like, oh, for you, when you're stressed, you walk more. For this person, when they're stressed, they actually walk less.
For this person, stress doesn't matter at all; it might be day of the week or something like that. And so the first phase of the study is basically to try to discover what's true for you, each and every single human being. And then once we do that, the next stage is we turn on that controller. And what the controller is doing, basically every single day, is running little micro experiments. Right. So every day it's running simulations, but then every day
it makes a decision: I'm going to give you a 9,000-step goal tomorrow, for person X, right, or I'm going to give you 500 points, which translates into, you know, money, Amazon gift cards, right? I think about 2,500 points was $5 or something like that. So that happens every single day, right? So every single day the controller is running simulations, it then makes a prediction, and then every single day it tests its
prediction on the person, right? Because every day it's like, I'm guessing if I give you a goal of 9,000 steps a day, you're going to walk 8,500. Of course I'm anthropomorphizing a controller, right? We all do. Yeah, yeah, it's math, right? It's basically saying, I'm running this little simulation to figure out a decision, I'm going to offer that, and I'm going to see what happens. And then the controller is smart enough to be able to take that into account and make adjustments. So if its predictions are really bad, it makes adjustments and says, OK, well, that's not the right thing for you. This is why it's perpetually adapting to the person's needs every single day. It has its own functional counterfactual built into the controller. That's how it works. Yeah, awesome.
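As a rough illustration of that daily predict, decide, observe, adjust loop, here is a small Python sketch. The model form, the candidate goals, and the adjustment rule are all invented for the example; this is not the controller used in the trial.

```python
# Illustrative only: a toy daily loop for a "perpetually adapting" controller.
# Each day it simulates candidate step goals with a simple per-person model,
# picks one, compares its prediction with what actually happened, and folds the
# prediction error back into the model. Numbers and model form are made up.
class PersonModel:
    def __init__(self, responsiveness=0.9, bias=0.0, learning_rate=0.1):
        self.responsiveness = responsiveness  # how strongly goals translate into steps
        self.bias = bias                      # person-specific offset, updated over time
        self.learning_rate = learning_rate

    def predict(self, goal):
        return self.responsiveness * goal + self.bias

    def adjust(self, error):
        self.bias += self.learning_rate * error  # bad predictions produce bigger corrections

def choose_goal(model, candidate_goals=(6000, 7500, 9000)):
    # "Run simulations": pick the most ambitious goal the model expects the person to reach.
    feasible = [g for g in candidate_goals if model.predict(g) >= g]
    return max(feasible) if feasible else min(candidate_goals)

model = PersonModel()
for actual_steps in (8200, 8900, 5100, 7700):   # what the person really walked each day
    goal = choose_goal(model)                   # decide today's goal
    error = actual_steps - model.predict(goal)  # test the prediction (the counterfactual check)
    model.adjust(error)                         # perpetually adapt
```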
But the person is not a new person; every single day there is some learning, right? So people change, but they don't change that much. Interesting you should ask that. I mean, yes, how much do people change, right? That's an empirical question, and that's an empirical question that we can and did study, coming back to Jungwan's
dissertation study, right? It's like, and that that's the stuff that really is at the core of kind of our work and philosophy is really just how can we actually build useful predictive models that we can then incorporate to basically make better decisions such that we can give people this really
healthy personalized experience. You know, when I think about just-in-time adaptive interventions, and this was critical to our experimental design with Jungwan, we tend to think about three concepts, all linked to specific intervention strategies, to make sure I tag that: one of them is need, the second is opportunity, and the third is receptivity. And so for any given intervention strategy, the first question is, does somebody need it?
Right? Like, if you're already meeting your step goals, why the hell should I send you a notification to get you to walk more? That would just be rude, right? OK, let's say you have a need. Then the next question is, OK, if I send you this notification right now, would you actually have the opportunity to go for a walk? Would you actually benefit from getting this? If the answer is no, why would we send this to you? Because again, we're just being annoying to you.
And then the final one is, OK, let's say you have a need. Let's say you have an opportunity to go for a walk. The last question is, would you actually appreciate getting the support? That's what we mean by being receptive, you know? And so you can imagine someone's in need an opportunity, but then they get this notification. They're like, dude, leave me alone, I don't like you, right?
And then you get this really bad relationship with the technology to continue to anthropomorphize the technology a bit and everything goes bad, right? And so we want to try to figure out what's going on for each person with that. And how do we take data that we can infer from data, signals and otherwise to sort of figure out just those right moments when you would actually want to get support from this technology in the exact type of support we're trying to provide to you?
And you call these the three just-in-time states. Can you say a little bit more about how you measure these, how hard they are to measure? One of them strikes me as much harder to measure than the other two, but yeah, how do you do all of this? Yeah. So coming back to the experimental design, we actually experimentally varied this using algorithms, and we set up algorithms that we could define a priori. And you know, it's actually not that complex, right?
So with need, how did we define that? I mean, the clear advantage we have is we had wearables, right? Everyone had Fitbit Versa 3 smartwatches, right? And so we always had our sort of ground truth of how many steps you as a person are engaging in. So that is what we built our need algorithm on, and what it was, basically, is: have you met your step goal, or are you on track to meeting your step goal? Yeah, easy peasy. Yeah, OK. Check, right.
Opportunity, this one was more complex. This is coming back to more of what usually tends to be in the computer science and AI world, as you said, machine learning. We basically developed a machine learning algorithm that we published previously, where again we took someone's step data.
And what we did is we used, you know, basic ML strategies, random forests and whatnot, to detect and build a probabilistic estimate of the likelihood that someone would be able to go for a walk in the next three-hour period or not. And the basic rule was: if this algorithm produced an estimate that either you were definitely not going to go for a walk or you were definitely going to go for a
walk, either one of those was basically saying this isn't a moment of opportunity for this intervention, because if you're definitely not, you know, we're going to leave you alone, and if you're definitely going to, you're already going to do it, so why send it to you? So that was the second algorithm. And this is based on historical data? Have you ever gone for a walk during this time period? Is it just based on time, or what other variables are included in
this measure of opportunity? Yeah, great question. Given that we wanted to be able to experimentally vary this, we wanted to keep it as simple as possible. So all we used was basically time, you know, and so we're really looking for: do you tend to walk in the mornings on weekdays, or in the afternoons, or whenever. It was all three-hour buckets, right? And it was basically the probability, based on how often you generally walk in this time period within a given weekly
structure. Which may actually be a safer bet than it sounds, given that people's schedules tend to remain fairly standardized over time. So if you go to work, you probably, you know, go around the same time every day. You probably drop the kids off at school around the same time and have lunch around the same time, and so on. So even if you're not accounting for the other variables, it's maybe one of the less risky things to assume. Exactly, that's what we were
working from. But even then, even if we assume it, we experimentally varied it, so even if that was wrong (I'll come back to that), we didn't always use it. The last one that we used, receptivity, was actually based on the HeartSteps trial that Susan mentioned in the previous episode. I was a member of that study
team and whatnot. And so we used the HeartSteps data, the first HeartSteps data, and what we found when we ran secondary data analysis was basically that if we sent more than two notifications, people would ignore them. In HeartSteps, we sent up to five in a given day. And anytime we got to like three or above, nobody wants that. No notifications, we're done, we're done with this. So we're like, OK, that's our budget, right? That's basically how we defined receptivity.
We ended up setting it up as a 72-hour window. But the basic idea is we set a budget on how many notifications we could send. If we were under the budget, we considered you receptive. If we were over the budget, we looked back at the last time we sent a notification, and if you responded favorably to it, we're like, OK, you seem to still want something, so we'll send you another notification. So that was the third one, receptivity.
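Putting the three checks side by side, here is a rough Python restatement of the decision logic as described. The specific probability cutoffs for opportunity and the exact budget handling are assumptions taken loosely from the conversation; the study's actual thresholds and features may differ.

```python
# Rough restatement of the three just-in-time-state checks described above.
# Thresholds are illustrative: the 0.2/0.8 opportunity cutoffs are assumptions,
# and the budget of two notifications per 72-hour window follows the discussion loosely.
from datetime import datetime, timedelta

def in_need(steps_so_far, step_goal, fraction_of_day_elapsed):
    """In need if not on pace to meet today's step goal."""
    return steps_so_far < step_goal * fraction_of_day_elapsed

def has_opportunity(p_walk_next_3h, low=0.2, high=0.8):
    """Opportunity only if a walk is plausible but not already near-certain either way."""
    return low < p_walk_next_3h < high

def is_receptive(sent_times, now, last_response_favorable, budget=2, window_hours=72):
    """Under the notification budget, or over it but the last prompt landed well."""
    recent = [t for t in sent_times if now - t <= timedelta(hours=window_hours)]
    return len(recent) < budget or last_response_favorable

now = datetime(2024, 5, 1, 14, 0)
send_notification = (in_need(3200, 9000, fraction_of_day_elapsed=0.6)
                     and has_opportunity(p_walk_next_3h=0.55)
                     and is_receptive([now - timedelta(hours=5)], now,
                                      last_response_favorable=False))
```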
I see. OK, you can imagine getting really fancy with this variable, right? Trying to understand someone's cognitive states and so on. So fancy. Yeah. And that's, you know, all the exploratory work that we're now engaging in. We have a lot of different types of analyses: we're using control systems approaches, we're using computer science approaches, as you alluded to, we're using lots of different methods. We're pretty agnostic on a particular style.
And honestly, just like the philosophy of a just in time adaptive intervention, that's my philosophy of research in general. I'm not saying which tool is like, you know, the universally good tool. You know, there's no sort of platonic ideal that I'm going for. I'm going for when, where and for whom should I use this tool? And what's gonna get the job done? What's gonna get the job done to
actually help people, exactly? So for people who may be less familiar with control systems versus machine learning, can you
¶ Control Systems vs. Machine Learning
give the very, very highest level contrast between the two? Sure, a very high-level contrast. So the field of control systems engineering is ubiquitous; it's in a lot of things you engage with. The simplest one is the thermostat in your house. That's a control system. But then it's also artificial pancreas systems, it's autopilots on planes or on a boat, it's pacemakers; those are all versions of control systems.
What is control systems engineering and control theory? It's basically an approach to making decisions in complex, dynamic environments. So a really simple one to think about: let's go to the boat, right? Say you want to go straight on a boat on the water. If you just keep your bow pointed straight, you're not going to go straight; the water is going to push you aside, right? Dynamic environment.
So what does a controller do? One of the simplest and most basic forms of a controller is called a PID controller: proportional, integral, derivative. You don't need to know that. A simple way to think about it is that it pays attention to one variable, in this case, for the boat, your compass, right? So say you want to go 180°,
right? And what it's basically doing is paying attention to the present, that's the P part, the past, that's the I part, and the future, that's the D part. Technically, the D part is the anticipated rate of change. OK, so what is it doing? It has this one variable, and you're basically saying, I want to go 180°. The boat's going along and responding. It starts to move, and the controller is paying attention as the, you know, the tiller is trying to make an adjustment
to keep the boat going straight. How much is the boat getting pushed around in this environment, right? It can take into account whether it's a really wavy day or really flat or otherwise, paying attention to all the stuff that's happened in the past. Coming to the present, it's always paying attention to, hey, I want to go 180°. Am I currently at 175, 178, right?
How far off am I from this? And then the last part, the D part, is saying, OK, given the basics it knows about the physics, in this example the water, it's making a guess. If you've ever been on a boat, if you turn the boat too much, you start getting this weird, nasty oscillatory pattern where you just go back and forth and back and forth as you keep overcompensating. That's what the last part is trying to minimize, basically.
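For readers who want to see the idea in code rather than on a boat, here is a minimal PID controller sketch in Python. The gains are arbitrary, and a real autopilot or thermostat would add things like output limits and anti-windup; this just shows the proportional, integral, and derivative terms acting on one variable.

```python
# Minimal PID controller for the boat-heading example. P reacts to the current
# error, I to the accumulated past error, and D to the error's rate of change,
# the anticipatory part that damps overcorrection. Gains here are arbitrary.
class PID:
    def __init__(self, kp=1.0, ki=0.1, kd=0.5, setpoint=180.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint      # desired heading in degrees
        self.integral = 0.0           # memory of past error
        self.prev_error = 0.0

    def update(self, heading, dt=1.0):
        error = self.setpoint - heading               # e.g. at 175 degrees, error is +5
        self.integral += error * dt                   # the "past" term
        derivative = (error - self.prev_error) / dt   # the "anticipated future" term
        self.prev_error = error
        # The output is the rudder/tiller correction to apply this time step.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID()
for heading in (175.0, 177.5, 179.0, 180.5):   # noisy compass readings
    correction = pid.update(heading)
```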
OK, so the whole field of control systems engineering starts with some basic philosophy like that and then it gets more and more complex. You know, the one that we have, we're basically using model based approaches where we're trying to figure out that like how to understand the past, present and future to make decisions with all of our models and what not. But so that's basically control systems. Let's go to the computer science
world of machine learning. So what are the basics of machine learning? Machine learning, I mean, there are many variations of this, but it tends to be built, at its foundation, on variations of Bayes' theorem.
And what is Bayes' theorem? It's building out conditional logic and conditional estimates of, you know, the likelihoods of different outcomes, and with that you can get very simple if-then logic and whatnot, and you then get into how you might turn that into some type of algorithm. This is where reinforcement learning comes from, which was inspired by our work as
behavioral scientists, right? Literally, that is where the language of reinforcement learning came from, right? They even call the success criterion a reward because of us. So yay, good job us, right? And we really don't get the props that we deserve there, do we? Seriously, I think that's exactly right. But then you get even more
complex, right? You get into neural networks and deep learning, and then you get into generative AI. And what is generative AI? These are all variations of basically trying to build out a probabilistic space of conditional logic to make estimates of which one of these potentialities is likely true, right? So even when you get all the way up to the complexity of generative AI and you have these large language models, it's still the same basic idea.
Basically: I'm going to look at the totality of evidence, and I'm going to make a guess at what the next thing is. Next word, then the very next word, ad infinitum, until you get some really cool, interesting things talking to you, yeah. And it's funny, because when you're on the consumer side, right, when you're interacting with the user interface of ChatGPT or whatever, it does not feel at all like, oh, this is
math. You have a completely different experience as just a person interacting with the machine, but you don't really see the core of what's actually going on.
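As a toy illustration of those "conditional estimates all the way up," here is a tiny next-word sketch in Python. The corpus and counts are invented; real large language models learn these conditional distributions with neural networks over enormous corpora rather than lookup tables, but the basic move, look at context, estimate probabilities, pick a likely next token, is the same.

```python
# Toy version of "guess the next word from conditional probabilities". The
# bigram counts below are invented purely to show the idea; real language models
# estimate these distributions with neural networks, not lookup tables.
from collections import Counter
import random

corpus = "people are different context matters things change people are busy".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))

def next_word(prev):
    candidates = {b: c for (a, b), c in bigram_counts.items() if a == prev}
    if not candidates:
        return None
    words, counts = zip(*candidates.items())
    total = sum(counts)
    probs = [c / total for c in counts]             # conditional P(next | prev)
    return random.choices(words, weights=probs)[0]  # sample a likely continuation

word, generated = "people", ["people"]
for _ in range(5):
    word = next_word(word) or "people"
    generated.append(word)
print(" ".join(generated))
```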
Thank you for that, that was a very beautiful description. And now can you tell me why, given all of the options of, you know, what you could explore as a means of increasing physical activity in a population (we use this as an example all the time), and this is what you actually did: why did you feel that control systems engineering might be better suited as a methodology rather than using AI/ML methods? So coming back to it, I think there's a time and place for each, so the question, and I would restate it slightly, is when do you use one or the other, you know, because I think you can and should use both of them; it just depends on
the point. So the key advantage that we see in using control systems engineering is that we can bring in prior domain knowledge. And so what it basically does is allow us to radically simplify what we need to learn, and this happened very concretely in Jungwan's dissertation. So just to be transparent, if anyone wants to read his dissertation, I would invite you to look into it.
We replicated and basically tried to build the algorithms for figuring out just-in-time states using both approaches: the control systems one (actually, technically it's more of a Bayesian modeling approach, but it was inspired by our system identification approach) versus a more data-driven, model-free orientation, which comes out of more of the machine learning tradition and logic
that we were describing, right? So the question then becomes, what was the prior domain knowledge that we incorporated into our system identification experiment? What we incorporated, first and foremost, was basically those algorithms I just described, right? So what we actually did is we experimentally varied turning on and off those variations of, you know, we're going to send you a notification.
Do we use our little algorithm to say that you have a need or not, paying attention to your goals? Do we use our little opportunity algorithm or not? Do we use our receptivity algorithm or not? And then we also took into account, based on a lot of prior work, that time of day and day of week also matter. Those became other key conditions that we paid attention to. OK.
And so what we basically did was experimentally vary, you know, the need, opportunity, and receptivity pieces. We also, of course, experimentally varied when we sent notifications or not. So sometimes we just sent them fully at random; a straight-up micro-randomization trial was basically embedded in our trial as our control condition, in some sense, for our algorithms. OK. And you can think about it basically as our random
decision policy, right? It's basically comparing a random decision policy versus an algorithmic decision policy, right? And then we also varied goals. So sometimes we sent high goals and low goals. Technically we used a pseudo-random signal design. We don't need to get into the depths of this, but you can use things like frequency-domain methods and a whole bunch of tools.
Basically, it allows you to experimentally vary things and sometimes send people high goals and low goals. And you can set it up such that it's orthogonal, meaning it's experimentally distinct from any of the other experiments. So we did all that in our system identification experiment. And then the key thing that Jungwan did was we actually turned this into hypotheses on different types of just-in-time
states. OK, so we ended up with 16 different types of conditions, and I won't go into all of them, but basically it comes down to variations of whether we had an algorithm that took into account need, opportunity, and receptivity all together, or only some of them (need and opportunity, or need and receptivity), or fully random. Those are the sort of four experimental things that we varied across time with every individual, over 270 days, with the 48 people who took part in our trial.
And then we also knew, you know, time of day and day of week, and that turned it into 16 different types of conditions, right. So it might be morning when we use the need-opportunity-receptivity algorithm, or afternoon when we use just the random algorithm, and so forth and so on. So with that, we had these 16
conditions that were common. And then what we could do is look at what happens when we send notifications nudging people to go for a walk in those states, versus not sending notifications. So what did we learn from this? First and foremost: if we had run this experiment using traditional methods, right,
like thinking about this as a factorial screening experiment, which means we would have run these analyses nomothetically, meaning using population-based statistics, you know, frequentist statistics and otherwise, the stuff that we traditionally use, we would have only identified the sort of teachable just-in-time states for 18% of our sample. Yeah, not great. Not that great.
I'm not impressed. And that was what we expected, honestly, a priori, again from our assumption that people are different, context matters, things change. When we actually built these models for each individual, and built this experimental space for manipulation and all that, long story short, we could actually discover it for 91% of our sample. And most critically, of the folks that we weren't helping, one of them was basically always
active. They actually had really high steps. Like, I don't even know why they signed up for our trial, because they were on average always exceeding the step goal. They really wanted that Amazon gift certificate.
Yeah, exactly. And then even when we did exploratory work, when we separated out need, opportunity, and receptivity and sort of simulated what if we had actually varied those experimentally, long story short, for everyone except that one person who was basically always active, we could repeatably find a teachable moment for
them. And most critically, what this would mean is that we could figure out, for each person over a 270-day period, the times when we should reliably send notifications to them, when they would actually benefit from them, which would then mean we would know all of the times that we should leave them alone, right? So think about all the issues like notification fatigue and all the signals that we get with this, right? We need to be able to have that kind of knowledge and
information to discover that. OK, so this is the prior domain knowledge that we brought in that allowed us to see and discover that. And all of that, again, was set up a priori. You can even look into our trial designs to see, you know, this wasn't an exploratory analysis. This was what we were thinking and openly expecting to see with our work. Now, in contrast, what would
¶ Challenges with Classical Machine Learning
have happened when we ran it with a more classical machine learning approach, and why, why would we run into problems? The short summary is we couldn't get nearly as good a result. There were some people in some contexts where we could learn little bits of things, you know, but most of the time it was not possible to discover these just-in-time states using classic machine learning methods. The question is why. Why?
Tell me why. Because the main reason is that when you're using machine learning algorithms, you are starting with no prior domain knowledge. And so what this means is that you need to literally learn everything about the person. And let's get this down to the basics of the math, right? When we go back to our comparison condition, where we're basically saying, no, no, we're experimentally varying
these 16 just-in-time states, and we're trying it all within these same states, we're functionally saying we don't need to boil the ocean. We don't need to learn about every possible state and every condition. We just need to learn what's happening here and now for these people, because this is the place where we want to make decisions, right? In contrast, you can take the exact same data set, and we did,
and run this the other way. But now think about it: if you're actually trying to discover this from a purely data-driven machine learning approach, you're going to have to start incorporating two-way interactions, three-way interactions, four-way, five-way, you know, many, many layers of interaction terms to do this. And so what you're doing is basically overwhelming the degree to which you can, you know, potentially
discover something. Now, of course, that's an oversimplification, and machine learning handles that type of issue quite well with all of its different sub-algorithms and the assumptions it makes about data structure, right? But even then, all this stuff needs to be fully discoverable by the technology itself.
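A quick bit of back-of-envelope arithmetic shows why the purely data-driven route gets heavy: with k candidate tailoring variables, the number of interaction terms up to a given order grows combinatorially. The numbers below are generic arithmetic, not values from the study.

```python
# Back-of-envelope arithmetic: with k candidate tailoring variables, the number
# of possible interaction terms up to order m grows combinatorially. Values are
# illustrative, not from the study.
from math import comb

def n_interaction_terms(k, max_order):
    return sum(comb(k, order) for order in range(2, max_order + 1))

for k in (5, 10, 20):
    print(k, n_interaction_terms(k, max_order=4))
# 5 -> 25, 10 -> 375, 20 -> 6175 candidate interactions to sift through,
# before adding main effects or per-person differences.
```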
And so you're not bringing that prior domain knowledge into the fray to help you have a targeted focus on: this is actually what I want to learn, this is what I need, and this is how I can take my behavioral science knowledge and expertise to focus the learning so I can do it quicker and more efficiently. And most critically, and this is really, really important, particularly now that we're running a larger trial with 386 people and we
needed to scale this: computational efficiency becomes radically important when you want to start scaling this, right? And the machine learning stuff, I mean, you can hear it, generative AI requires a ridiculous amount of computational energy. You can brute-force it and get some answers. Are you going to be able to do that when it comes down to trying to build something that's appropriate for each person at each time? Maybe.
But man, that's going to take a lot of computational energy. With the stuff that we're doing, we really can create highly efficient algorithms where we can learn a lot with a very limited amount of information. And then most critically, coming back to the way the control system works: every single day it's making a prediction; it basically has its own version of a hypothesis that it's testing.
And then as it tests its hypothesis, that gets incorporated into the model so it can get better and better at making decisions. And so it creates a radical amount of computational efficiency for making decisions. Basically, that's why we thought it would make sense here and now in this space. I want to ask you one more thing before we zoom out and talk about AI or not AI in more general terms. So this research study that
¶ Translating Research to Real-World Applications
you're describing is all in a research context, right? I want to think about what the next step is when you're thinking about translating this into the real world. So imagine that you are Apple, you own the Apple Watch, and you want to be able to really implement this across every person who wants this sort of health coach in their pocket, as you might describe it. How would you then use these findings to make that translation? Yeah, great question.
And that always has been our core focus, right. So coming back to this, we're always trying to think about the decisions that can actually work in a real-world context, you know. So I hope you noticed I've never actually brought up using ecological momentary assessment as an example, right? None of our algorithms are using anything that actually requires something active, technically. By the way, we did gather that data; we're exploring it.
We're seeing if there is value in ecological momentary assessment for all of our classic, you know, social cognitive constructs and whatnot. But it's not... It's not essential for the intervention to work. Exactly. And that's particularly critical when we think about future sustainability, right.
So if I'm thinking about it, and I have friends at Google and whatnot, I've had some conversations about this: what they basically need is an algorithm that can be integrated into their technology in a computationally efficient way, with people who are able to monitor it in some sort of meaningful way. We were actually kind of calling Muhammad, Daniel's student who really fleshed out the control algorithm,
this is for our other trial, the NCI-funded project, we kind of called him the data science coach, because what he was basically doing is he could be monitoring, you know, our 300, well, technically half of the sample, right? He could be monitoring a whole bunch of people. And what he's doing is he's not tweaking the algorithm for each person; he's looking for the anomalies.
Basically he's asking the question of who is not yet getting supported by this technology, and that's stuff that you can set up pretty easily and that scales with lots of people, right? And so that end goal has always been the frame of mind for us, right?
And so what we're doing is trying to demonstrate that we can build an individualized model for each person in a highly computationally efficient way, where basically every night this can be run. You know, we're running it on an Amazon Web Services (AWS) server, right? Totally normative structures and whatnot; it's all running automated. There are some degrees of checks, but it's the kind of checks that any good data scientist could do with this algorithm.
And so all you basically need is to pay for the server time to actually run this. And particularly when you put it against the sort of computational horsepower that you'd need for generative AI versus what we're talking about: we're literally using something like 0.1% of the computational energy that you would need if you were trying to build a generative AI tool to do the exact same thing. Yeah. And I think the beauty of it, or one of the beauties of it, is that it's not a static
intervention, right? It's constantly changing, keeping the boat on course over time. Remind me of your mantra: things change, people are different... People are different, context matters, things change. Things change, yeah, yeah. I'll always think of you when I hear that. Yeah. And that's it. That's why we call it a personalized, perpetually adapting intervention, right?
It is personalized in that we are literally building an individual model for each person. I describe it as perpetually adapting because we have a controller that's basically always taking into account the possibility that just because this was true for you in the past doesn't mean it's still true, and we can correct for that. All right, now I want to do our zooming out. Let's forget about this use case of physical activity and just think about living, experiencing the world.
And, you know, how do we do this? When you think about it, and perhaps this is too broad, you can narrow us down if you'd like: what sort of advice would you give people who are trying to consider whether their use case warrants using AI or not? Or, to put it closer to your language, how do you decide what the most useful approach is, whether it's AI-based or not?
Yeah, great question. And the way I tend to think about this is honestly just getting down to very pragmatic ways of thinking, you know, and so the frame I've been advocating more and more is decision-focused evidence production. What the heck do I mean by that? Basically, what I'm trying to get at is, you know, in our field we're all about evidence-based decision making, right? So in some sense this is just a
logical flip on that. The issue is that the evidence we have traditionally been trying to build out is useful for some types of decisions, but not for all the decisions we're trying to make, right? So with that in mind, coming back to what we were just talking through: take a randomized controlled trial, just to start, in the kind of everyday world of, you know, the behavioral science and behavioral medicine community.
That's a very useful methodology when you are trying to support the decision making of policy makers. If you want to build guidelines on what generally someone should do. Yep, everyone sees the same exact information. You're you're sort of attached to the one-size-fits-all approach. Right. And that's OK. I mean, it's, it's basically, it helps coming back to thinking about it as a decision orientation.
It's very valuable if a policy maker is making a decision about an entire population and they're trying to decide, what's the first thing I should offer someone? That's what a frequentist estimate is, right? The result of a randomized controlled trial is literally a frequency estimate. You know, it's like, 60% of the sample will get this, and so on.
I mean, the question is, is that the only kind of decision we need to support? Hopefully no, of course not, right. The example we were just talking through was: I'm actually trying to support the decision making of a control system to help an individual make their decisions. That is a very individual, temporal style of decision making, which requires a different way of producing evidence.
There's a whole other line of work that I'm doing more and
¶ Community-Based Research and Context Matters
more on, which is actually community work. And so this is very consciously place-based work, where we are consciously listening to community members, understanding their priorities and needs and otherwise, and then from that, trying to produce evidence that is aligned with their priorities, honoring their real-world constraints, and, also critically, building on their assets.
You know, people are not just this vague, abstract notion; people are living and breathing in real-world contexts and situations, and taking that into account can actually help you radically reduce complexity without compromising the complexity. And then finally, and this is an area my close colleague and friend Pedja Klasnja has been thinking a lot about, is the notion of good-enough evidence production.
You know, and I think it's just really obvious when you get into community work. I can tell you, with the different community groups I work with now, you know, they are so sick of being researched. And they tell me, please, if you just want to come in and research our group, leave us the hell alone, right? And here's why. They're so used to people coming in and saying, OK, I want to help you.
But the first step is, I need to do a needs assessment. I need to know what the needs actually are for the community. They're like, you're the 17th academic from your institution who has come here to tell us what our needs are. It takes you three years to actually do that work. You do the work, you come back, you tell us what our needs are, and then we tell you those aren't our needs anymore because it took you too long. That's good-enough evidence production:
we were not good enough. We didn't fit with the real-world constraints. OK. So decision-focused evidence production, I think, is a framework for actually coming back to how you answer the question of to AI or not to AI, right. It comes back to a variation of people are different, context matters, things change. How do you operationalize that?
Ask the question: who are you producing evidence for, or who is this algorithm that you're trying to build going to help? Is it going to be an individual human being across time? Is it going to be a policy maker who's trying to make decisions for populations? Is it someone with a very place-based orientation, like community leaders, community organizers, or maybe leaders within a healthcare system who are trying to do continuous quality improvement within their
healthcare system or domain? Each one of those sets up a very different starting condition on what success is and the local knowledge they need. OK, so that's basically people are different: how do you take into account, and listen to them on, what their priorities actually are, and build your evidence or your algorithm to serve their priorities? Second step, context matters.
What is this now? This is basically saying, OK, now that you know who you're trying to help, what is their context? What are they living in? And this is coming back to both the potential constraints and the resources they may or may not have, particularly if you're working with underserved populations. They might not have the smartphone that you're expecting. They might not have the data access that you're expecting, right?
They might not have a level of digital literacy that's really critical for you to take into account. There are things that you might have assumed that are just not appropriate to assume in this context. Or, on the flip side, they might have a great deal of assets and strengths. I can tell you, we've done a study, again with my colleagues Antua Comb, Ben Blanca Melendez and others, with our community work.
We used this digital tool called Streetwyze; basically, imagine it's like Yelp for social determinants of health. So what we did is we gathered about 8,000 stories from 1,500 San Diegans from historically underserved populations and groups all around the county. And this was our way to kind of do this shared needs-assessment development work.
And I could go into depth on this, but I won't; just to jump down to the summary statement of the key thing related to the second point, context matters, and to assets: we asked them pretty open-ended questions. Tell us about your community, you know, tell us the good things, the bad things, the things that need to be fixed. And guess the percentage of the stories that were positive. Yeah, I won't put you on the spot.
I mean, I would say probably pretty high. I would think it's fairly positive. Exactly. OK, good, I'm glad you made that guess. Because the stereotype is, oh, these are marginalized communities, they're going to be disaffected. Yes, yeah, OK. I mean, the way that people are incentivized to write grants is a variation of poverty porn. We basically justify our needs by saying we're going to go help the needy
and the poor, right? It sounds very patronizing. Yeah, of course, right? But yes, about 50% of the stories were positive, right? That's critical. So this also comes back to that sort of, you know, context matters. You can take their constraints into account, but really listen to their strengths, build on people's strengths. They have them, they have assets, they have a bunch of them. Like, our food culture in San Diego is amazing, as an example; they loved our food.
So, last one: things change. OK, this comes back to that sort of good-enough evidence production. You basically need to ask the question: what's the level of rigor you actually need to meet your priorities within the real-world constraints you're working with? And just to be clear, this is something I had not done for a long time. You know, in my earlier work, I was just trying to figure out this control systems stuff, and it was just like, oh my
God, this is insanely rigorous. But the times when this level of rigor is actually needed? 95% of the time, it's silly. You just need a healthy relationship with a human being.
Yeah, you know, honestly, it makes sense in certain contexts, when you have the right resources and otherwise. As you brought up, I think this is an easy no-brainer for Apple or Google or a large company, because we understand their priorities, we understand their context and their assets and constraints, and this is totally good enough within their structure, right?
But if you're trying to work within a federally qualified health center, right, completely different context, this is not going to work; it's laughable, right? This is not good enough. But can we do something? Can we maybe build something that's like a support aid for community health workers or promotoras? Yeah, totally. I think there are some fantastic structures for us to kind of play through within that space,
right? It all depends on the priorities of the people, the context they're in, and then building the evidence such that it is just rigorous enough to help them move forward and do their work. And so now that is the question I would think about when making any decision, including to AI or not to AI: basically, what are those situations when the priorities, the context, and the sort of temporal constraints are aligned to use it?
And if you haven't thought about that yet, the one I would advise you to think the most about is number three. Generative AI models, I think, are fascinating. I think they're really interesting, right? I think the large language models can be really powerful. I think there are some really interesting things about chatbots and otherwise that we're definitely thinking about and exploring, particularly if you start mixing together your large
language models so that you can sort of build a recursive learning structure within it. I'm terrified of that. Well, we'll come back to that. You should be, right? Because what we're basically building is a simulacrum. We're building a space where we actually lose sight of reality, where we get stuck not knowing we're stuck, where we think symbols are reality, right?
That is the logical endgame of generative AI, right? It's basically getting you into, you know, the Matrix. So you should totally be scared of that, but that's a different conversation. I would just ask the question of when I would want to use it, right? I would basically say: when it's actually serving human beings in a loving, thoughtful, caring, and compassionate way that is not
manipulative, not exploitative. That's my restatement of prioritization. When it fits into the context and constraints and it's not basically sucking resources out of others, right? If you need to run these huge data farms, sucking away so much energy that you have to take it from some other part of the planet, maybe you shouldn't do it. And when it's within the temporal constraints you're working in, right, such that it can actually work.
Those would be the times and places when I would use it. Yeah, it's interesting. I think there's this sort of paradox that I've encountered when people talk about when to use generative AI and when not to. A lot of the talking points come down to: don't use it for really high-stakes, important decisions because it can't be trusted, and make sure that you're an expert so that you can identify when it's gone
wrong. And so really only use it for, you know, summarizing things that aren't very important, or structuring something you've already written. And I myself have said these kinds of things. But then when you think about, for example, the environmental toll of this technology, you think, wait, do we really want to...
So only use it for trivial things, but we need to spend billions and billions of dollars trying to prop up ChatGPT as it figures itself out, and it's trying to take everyone's data records to basically build larger and larger language models. There is a logical contradiction deeply embedded in it, yes.
Yes. So personally, if we are going to make this sacrifice, or this trade-off you could say, I would like for it to be useful in important contexts, for high-stakes decisions. But then I also backtrack and think: do I want the machines making those decisions? Anyway, there's a big hole there as well. Actually, if you're not too sick of talking about to AI or not to AI, I'd love to move on to our quickfire round. Are you ready?
¶ Quickfire Round: To AI or Not to AI
Awesome. Sure. So I'm going to give you a bunch of different tasks, and you're going to tell me whether that task is well suited to AI or not. The first one I will have to apologize for in advance. To AI or not to AI: a heckler at a live comedy show? A heckler at a live comedy show? I imagine a human could be a funnier heckler than an AI. Maybe, I would hope. OK, an AI granny who subscribes to call lists to waste scammers' time? An AI granny subscribed to call lists to waste... I can see that. So this is actually a real
example. A telecom company in the UK has developed this granny, which people call, and then she, you know, talks to them for hours and hours, and it makes them very frustrated. As long as it's being used, again, against, you know, telemarketers and those who are in that exploitative space of taking advantage of people, yes. That is exactly the purpose. That's great. That sounds like a Mark Rober sort of thing he did too. So that's great.
OK, I'm into that. A puppy friend for your aging golden retriever? I would rather just get a... why not just get a dog? No, I don't see the need for that. Why replace life with, like, a proto-version of life? Seems fair. Design experiments for humans, so designing human subjects research? I wouldn't have it do it alone. But one way I've kind of thought of AI, particularly generative AI, is almost like a fourth-person
perspective. You know, it gives you the capacity to talk with the sort of knowledge of our species, at least the conditional knowledge that we've collected through large language models and otherwise, right? And so, I mean, my experience of generative AI tends to be that it's a very handy sanity check: hey, did I forget something? Can you help me think through
this? And so, as long as you're using it in a loop with a human being, where ultimately it is helping you refine the wisdom and clarity of the person themselves to do better at their own work, I think that makes a lot of sense as a use. If I'm having it do it by itself? No way. No. Yeah, I don't see that as helpful. OK, generate a personalized VR world, a virtual reality world, to
optimize a user's well-being. Oh, we're definitely getting into, like, Ready Player One world now, right? So that would be one that I would be very leery of. It just feels like we've seen enough variations of that story played through sci-fi to know there are some unintended consequences to be playing through.
And I would come back to: if you need to build something of that level of complexity, again, how much computational power are you building for that? And is there something more useful, caring for the love and, you know, life of our planet, that it should be going to? Yeah, absolutely. I have one more for you. Eradicate death: to AI or not to AI? Eradicate death? That doesn't make any sense to me, because death is part of life, right?
If you actually allowed for that today... no, I am not into the idea of trying to live forever. All right, now I'm going to ask you the question that we ask all of our guests this season. What is your most controversial
opinion about AI? I think my most controversial opinion, probably a variation of what I was just getting into, is that we really love to outsource to technology and not recognize that we're living, beautiful, sentient beings that need to be cared for and loved and cultivated, and that we need to actually cultivate our capacity to feel and to know and to have a sense of what is a good moral thing to do.
That is psychology. So if we don't actually have a robust psychology as a check on AI, you know, it's very much like David Hume and the sort of is-ought thinking: we can do a whole lot of "is" and we can build a lot of things, but we don't have that grounding in what ought to be, and therefore the likelihood of it creating
harm is much higher. To me, the role of behavioral science and psychology is vastly important here, to build the checks on these algorithms. Well, Eric, this was really lovely. Thank you so much for joining me and sharing all of your wisdom and going as deeply into all the details as I begged you to. Thank you for everything. It was awesome. Thanks, it was fun, Eileen. I really appreciate it. And that's a wrap.
You've been listening to the Behavioral Design Podcast, brought to you by Habit Weekly and Nuance Behavior. Sam and Eileen tell me this season is packed with incredible insights about behavioral design and AI, so be sure to subscribe and share the podcast with your friends, though you might want to keep it away from your enemies. In case you haven't noticed, I'm an AI voice. Yep, pretty crazy. Quite the improvement since last season's AI outro, don't you
think? If you'd like to collaborate with us at Nuance Behavior, where we use behavioral design to craft digital products with Nuance, e-mail us at [email protected] or book a call directly on our website, nuancebehavior.com. A special thanks to the amazing Dave Pizarro for our show music and to Mei Chen Yap and April English for their help in producing and publishing this episode. Thanks again for tuning in.
We'll be back soon with another exciting conversation where behavioral design and AI intersect. Heavens to Murgatroyd! The question is why? Yeah, why? Tell me why, tell me why, tell me why.