Ideas: Solving network management puzzles with Behnaz Arzani

Jun 13, 2024, 44 min

Episode description

Behnaz Arzani loves hard problems and the freedom to explore. That makes research a great fit! She discusses her work in network management, including the potential role of LLMs in the field; the challenges that excite her; and how storytelling changed her life.

Transcript

[TEASER] [MUSIC PLAYS UNDER DIALOGUE]

BEHNAZ ARZANI

I guess the thing I'm seeing is  that we are freed up to dream more—in a way.   Maybe that's me being too … I'm a little bit of a  romantic, so this is that coming out a little bit,   but it's, like, because of all this, we have  the time to think bigger, to dream bigger,   to look at problems where maybe five years  ago, we wouldn't even dare to think about.

[TEASER ENDS]

GRETCHEN HUIZINGA

You’re listening to Ideas, a Microsoft Research Podcast that dives deep into the world of technology research and the profound questions behind the code. I'm Dr. Gretchen Huizinga. In this series, we'll explore the technologies that are shaping our future and the big ideas that propel them forward. [MUSIC FADES]

My guest today is Behnaz Arzani. Behnaz is a principal researcher at Microsoft Research, and she's passionate about the systems and networks that provide the backbone to nearly all our technologies today. Like many in her field, you may not know her, but you know her work: when your networks function flawlessly, you can thank people like Behnaz Arzani. Behnaz, it's been a while. I am so excited to catch up with you today. Welcome to Ideas!

ARZANI

Thank you. And I'm also excited to be here.

HUIZINGA

So since the show is about ideas and  leans more philosophical, I like to start with   a little personal story and try to tease out  anything that might have been an inflection   point in your life, a sort of aha moment, or  a pivotal event, or an animating “what if,” we   could call it. What captured your imagination and  got you inspired to do what you're doing today?

ARZANI

I think that it was a little bit of an accident and a little bit of just chance, I guess, but for me, this happened because I don't like being told what to do! [LAUGHTER] I really hate being told what to do. And so, I got into research by accident, mostly because it felt like a job where that wouldn't happen. I could pick what I wanted to do. So, you know, a lot of people come talking about how they were the most curious kids and they all—I wasn't that. I was a nerd, but I wasn't the most curious kid. But then I found that I'm attracted to puzzles and hard puzzles and things that I don't know how to answer, and so that gravitated me more towards what I'm doing today. Things that are basically difficult to solve … I think are difficult to solve.

HUIZINGA

So that's your inspiring  moment? “I'm a bit of a rebel, and …”

ARZANI

Yup!

HUIZINGA

“… I like puzzles …”?

ARZANI

Yup! [LAUGHTER] Which is not really a moment. Yeah, I can't point to a moment. It's just been a journey, and it's just, like, been something that has gradually happened to me, and I love where I am, but I can't really pinpoint to like this, like this inspiring awe-drop—no.

HUIZINGA

OK. So let me ask you this:   is there nobody in this building that  tells you what to do? [LAUGHS]

ARZANI

There are people who  have tried, [LAUGHS] but …

HUIZINGA

Oh my gosh!

ARZANI

No, it doesn't work. And I think if you  ask them, they will tell you it hasn't worked.

HUIZINGA

OK. The other side question is, have you  encountered a puzzle that has confounded you?

ARZANI

Have I encountered a puzzle?  Yes. Incident management. [LAUGHTER]

HUIZINGA

And we'll get there in the  next couple of questions. Before we do,   though, I want to know about who might  have influenced you earlier. I mean,   it's interesting. Usually if you don't have a  what, there might not be a who attached to it …

ARZANI

No. But I have a who. I  have multiple “whos” actually.

HUIZINGA

OK! Wonderful. So tell us a little bit  about the influential people in your life.

ARZANI

I think the first and foremost is my mom. I have a necklace I'm holding right now. This is something my dad gave my mom on their wedding day. On one side of it is a picture of my mom and dad; on the other side is both their names on it. And I have it on every day. To my mom’s chagrin. [LAUGHTER] She is like, why? But it's, like, it helps me stay grounded. And my mom is a person that … she had me while she was an undergrad. She got her master’s. She got into three different PhD programs in her lifetime. Every time, she gave it up for my sake and for my brother's sake. But she's a woman that taught me you can do anything you set your mind to and that you should always be eager to learn.

She was a chemistry teacher, and even though she was a chemistry teacher, she kept reading new books. She came to the US to visit me in 2017, went to a Philadelphia high school, and asked, can I see your chemistry books? I want to see what you're teaching your kids. [LAUGHTER] So that's how dedicated she is to what she does. She loves what she does. And I could see it on her face on a daily basis. And at some point in my life a couple of years ago, I was talking to my mom about something, and she said, tell yourself, “I'm stronger than my mom.”

HUIZINGA

Oh my gosh.

ARZANI

And that has been, like, the most  amazing thing to have in the back of my   head because I view my mom as one of  the strongest people I've ever met,   and she's my inspiration for everything I do.

HUIZINGA

Tell yourself you're  stronger than your mom. … Did you?

ARZANI

I'm not stronger than my  mom, I don't think … [LAUGHS]

HUIZINGA

[LAUGHS] You got  to change that narrative!

ARZANI

But, yes, I think it's just this thing of,   like, “What would Mom do?” is a great  thing to ask yourself, I think.

HUIZINGA

I love that. Well, and so  I would imagine, though, that post-,   you know, getting out of the house, you've  had instructors, you've had professors,   you've had other researchers. I  mean, anyone else that's … ?

ARZANI

Many! And in different stages of your  life, different people step into that role,   I feel like. One of the first people  for me was Jen Rexford, and she is   just an amazing human being. She's an amazing  researcher, hands down. Her work is awesome,   but also, she's an amazing human being, as  well. And that just makes it better.

HUIZINGA

Yeah.

ARZANI

And then another person is Mohammad  Alizadeh, who's at MIT. And actually, let’s see,   I'm going to keep going a little with  people—Mark Handley. When I was a PhD   student, I would read their papers, and I'd  be like, wow! And, I want to be like you!

HUIZINGA

So linking that back to your love  of puzzles, were these people that you admired   good problem solvers or … ?

ARZANI

Oh, yeah! I think Jen is one of those who … a lot of her work is also practical, like, you know, straddles a line between both solving the puzzle and being practical and being creative and working with theorists and working with PL people. So she's also collaborative, which is, kind of, my style of work, as well. Mohammad is more of a theorist, and I love … like more the theoretical aspect of problems that I solve. And so, like, just the fact that he was able to look at those problems and think about those problems in those ways. And then Mark Handley’s intuition about problems—yeah, I can't even speak to that!

HUIZINGA

That's so fascinating because you've identified three really key things for a researcher. And each one is embodied in a person. I love that. And because I know who you are, I know we're going to get to each of those things probably in the course of all these questions that I'll ask you. [LAUGHTER] So we just spent a little time talking about what got you here and who influenced you along the way. But your life isn't static. And at each stage of accomplishment, you get a chance to reflect and, sort of, think about what you got right, what you got wrong, and where you want to go next. So I wonder if you could take a minute to talk about the evolution of your values as a researcher, collaborator, and colleague and then a sort of “how it started/how it's going” thing.

ARZANI

Hmm … For me, I think what I've learned is to be more mindful—about all of it. But I think if I talk about the evolution, when you're a PhD student, especially if you're a PhD student from a place that's not MIT, that's not Berkeley, which is where I was from, my main focus was proving myself. I mean, for women, always, we have to prove ourselves. But, like, I think if you're not from one of those schools, it's even more so. At least that's how I felt. That might not be the reality, but that's how you feel. And so you're always running to show this about yourself. And so you don't stop to think how you're showing up as a person, as a researcher, as a collaborator. You're not even, like, necessarily reflecting on, are these the problems that I enjoy solving? It’s more of, will solving this problem help me establish myself in this world that requires proving yourself and is so critical and all of that stuff?

I think now I stop more. I think more, is this a problem that I would enjoy solving? I think that's the most important thing. Would other people find it useful? Is it solving a hard technical question? And then, in collaborations, I'm being more mindful that I show up in a way that basically allows me to be a good person the way I want to be in my collaboration. So as researchers, we have to be critical because that's how science evolves. Not all work is perfect. Not all ideas are the best ideas. That's just fundamental truth. Because we iterate on each other's ideas until we find the perfect solution to something. But you can do all of these things in a way that's kind, in a way that's mindful, in a way that respects other people and what they bring to the table. And I think what I've learned is to be more mindful about those things.

HUIZINGA

How would you define mindful?  That's an interesting word. It has a lot   of baggage around it, you know, in terms  of how people do mindfulness training.   Is that what you're talking about, or  is it more, sort of, intentional?

ARZANI

I think it's both. So I think one of the things I said—I think when I got into this booth even—was, I'm going to take a breath before I answer each question. And I think that's part of it, is just taking a breath to make sure you're present is part of it. But I think there is more to it than that, which is I don't think we even think about it. I think if I … when you asked me about the evolution of how I evolved, I never thought about it. I was just, like, running to get things done, running to solve the question, running to, you know, find the next big thing, and then you're not paying attention to how you're impacting the world in the process.

HUIZINGA

Right.

ARZANI

And once you start paying attention, then  you're like, oh, I could do this better. I can do   that better. If I say this to this person in  that way, that allows them to do so much more,   that encourages them to do so much more.

HUIZINGA

Yeah, yeah.

ARZANI

So …

HUIZINGA

You know, when you started out, you said, is this a problem I would enjoy solving? And then you said, is this a problem that somebody else needs to have solved? Which is sort of like “do I like it?”—it goes back to Behnaz at the beginning: don't tell me what to do; I want to do what I want to do. Versus—or and is this useful to the world? And I feel like those two threads are really key to you.

ARZANI

Yes. Basically, I feel like that defines me as a researcher, pretty much. [LAUGHS] Which is, you know, I was one of the, you know, early people … I wouldn’t say first. I'm not the first, I don't think, but I was one of the early people who was talking about using machine learning in networking. And after a while, I stopped because I wasn't finding it fun anymore, even though there was so much hype about, you know, let's do machine learning in networking. And it's not because there's not a lot of technical stuff left to do. You can do a lot of other things there. There's room to innovate. It's just that I got bored.

HUIZINGA

I was just going to say, it's still cool, but Behnaz is bored! [LAUGHTER] OK, well, let's start to talk a little bit about some of the things that you're doing. And I like this idea of a researcher, even a person, having a North Star goal. It sounds like you've got them in a lot of areas of your life, and you've said your North Star goal, your research goal, is to make the life of a network operator as painless as possible. So I want to know who this person is. Walk us through a day in the life of a network operator and tell us what prompted you to want to help them.

ARZANI

OK, so it's been years since I actually,  like, sat right next to one of them for a long   extended period of time because now we're in  different buildings, but back when I was an   intern, I was actually, like, kind of, like  right in the middle of a bunch of, you know,   actual network operators. And what I observed  … and see, this was not, like, I've never lived   that experience, so I'm talking about somebody  else's experience, so bear that in mind …

HUIZINGA

Sure, but at least you saw it …

ARZANI

Yeah. What they do is, there's a lot of,  “OK, we design the network, configure it.” A lot   of it goes into building new systems to manage it.  Building new systems to basically make it better,   more efficient, all of that. And then they  also have to be on call so that when any of   those things break, they're the ones who have  to look at their monitoring systems and figure   out what happened and try to fix it. So they  do all of this in their day-to-day lives.

HUIZINGA

That's tough …

ARZANI

Yeah.

HUIZINGA

OK. So I know you have a story about  what prompted you, at the very beginning,   to want to help this person. And it had  some personal implications. [LAUGHS]

ARZANI

Yeah! So my internship  mentor, who's an amazing person,   I thought—and this is, again, my perception  as an intern—the day after he was on call,   he was so tired, I felt. And so grumpy …  grumpier than normal! [LAUGHTER] And, like,   my main motivation initially for working in this  space was just, like, make his life better!

HUIZINGA

Make him not grumpy.

ARZANI

Yeah. Pretty much. [LAUGHS]

HUIZINGA

Did you have success at that  point in your life? Or was this just,   like, setting a North Star goal  that I'm going to go for that?

ARZANI

I mean, I had done a lot of work in monitoring space, but back then—again, going back to the talk we were having about how to be mindful about problems you pick—back then it was just like, oh, this was a problem to solve, and we'll go solve it, and then what's the next thing? So there was not an overarching vision, if you will. It was just, like, going after the next, after the next. I think that's a point where, like, it all came together of like, oh, all of the stuff that I'm doing can help me achieve this bigger thing.

HUIZINGA

Right. OK, Behnaz, I want to  drop anchor, to use a seafaring analogy,   for a second and contextualize the language that  these operators use. Give us a “networking for   neophytes” overview of the tools they  rely on and the terminology they use in   their day-to-day work so we're not lost  when we start to unpack the problems,   projects, and papers that  are central to your work.

ARZANI

OK. So I'm going to focus on my pieces of this just because of the context of this question. But a lot of operators … just because a lot of the problems that we work on these days to be able to manage our network, the optimal form of these problems tend to be really, really hard. So a lot of the times, we use algorithms and solutions that are approximate forms of those optimal solutions in order to just solve those problems faster. And a lot of these heuristics, some of them focus on our wide area network, which we call a WAN. Our WANs, basically what they do is they move traffic between datacenters in a way that basically fits the capacity of our network. And, yeah, I think for my work, my current work, to understand it, that's, I think, enough networking terminology.

HUIZINGA

OK. Well, so you've used the  term heuristic and optimal. Not with   an “s” on the end of it. Or you do  say “optimals,” but it's a noun …

ARZANI

Well, so for each problem  definition, usually, there's one way   to formulate an optimal solution. There  might be multiple optima that you find,   but the algorithm that finds the optimum  usually is one. But there might be many,   I guess. The ones that I've worked  on generally have been one.

HUIZINGA

Yeah, yeah. And so in terms of how  things work on a network, can you give us just   a little picture of how something moves  from A to B that might be a problem?

ARZANI

So, for example, we have these datacenters that generate terabytes of traffic and—terabytes per second of traffic—that wants to move from point A to point B, right. And we only have finite network capacity, and these, what we call, “demands” between these datacenters—and you didn't see me do the air quotes, but I did the air quotes—so they go from point A to point B, and so in order to fit this demand in the pipes that we have—and these pipes are basically links in our network—we have to figure out how to send them. And there's variations in them. So, like, it might be the case that at a certain time of the day, East US would want to send more traffic to West US, and then suddenly, it flips. And that's why we solve this problem every five minutes! Now assume one of these links suddenly goes down. What do I do? I have to re-solve this problem because maybe the path that I initially picked for traffic to go through goes exactly through that failed link. And now that it's disappeared, all of that traffic is going to fall on the floor. So I have to re-solve that problem really quickly to be able to re-move my traffic and move it to somewhere else so that I can still route it and my customers aren't impacted.

What we're talking about here is a controller, essentially, that the network operators built. And this controller solves this optimization problem that figures out how traffic should move. When it's failed, then the same controller kicks in and reroutes traffic. The people who built that controller are the network operators.
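As an illustration of the rerouting step described above (not something shown in the episode), here is a minimal Python sketch. It is a deliberate simplification: the production controller solves a capacity-aware optimization problem, while this sketch swaps in a plain breadth-first shortest-path search that ignores capacity. The topology in `LINKS`, the datacenter names, and the function `shortest_path` are all hypothetical.

```python
from collections import deque

# Hypothetical three-datacenter topology; each tuple is an undirected link.
LINKS = [
    ("east-us", "west-us"),
    ("east-us", "central-us"),
    ("central-us", "west-us"),
]


def shortest_path(links, src, dst, failed=frozenset()):
    """Return a fewest-hop path from src to dst that avoids failed links.

    Returns None when no path survives the failures.
    Capacity is ignored here, unlike in a real traffic-engineering controller.
    """
    neighbors = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue  # skip links reported as down by monitoring
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


before = shortest_path(LINKS, "east-us", "west-us")
after = shortest_path(LINKS, "east-us", "west-us",
                      failed={("east-us", "west-us")})
print(before)  # direct link while everything is healthy
print(after)   # detour through the remaining links after the failure
```

Re-running the same search with an updated `failed` set mirrors, in miniature, the idea of the controller re-solving the problem when monitoring reports a link down.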

HUIZINGA

And so who does the problem-solving  or the troubleshooting on the fly?

ARZANI

So hopefully—and this, most of the times, is the case—is we have monitoring systems in place that the operators have built that, like, kind of, signal to this controller that, oh, OK, this link is down; you need to do something.

[MUSIC BREAK]

HUIZINGA

Much of your recent work  represents an effort to reify the   idea of automated network management and  to try to understand the performance of   deployed algorithms. So talk about the  main topics of interest here in this   space and how your work has evolved in an era  of generative AI and large language models.

ARZANI

So if you think about it, what generative AI is going to enable, and I'm using the term “going to enable” a little bit deliberately because I don't think it has yet. We still have to build on top of what we have to get that to work. And maybe I'll reconsider my stance on ML now that, you know, we have these tools. Haven't yet but might. But essentially, what they enable us to do is take automated action on our networks. But if we're allowing AI to do this, we need to be mindful of the risks because AI in my, at least in my head of how I view it, is a probabilistic machine, which, what that means is that there is some probability, maybe a teeny tiny probability, it might get things wrong. And the thing that you don't want is when it gets things wrong, it gets things catastrophically wrong. And so you need to put guardrails in place, ensure safety, figure out, like, for each action be able to evaluate that action and the risks it imposes long term on your network and whether you're able to tolerate that risk. And I think there is a whole room of innovation there to basically just figure out the interaction between the AI and the network and where … and actually strategic places to put AI, even.

HUIZINGA

Right.

ARZANI

The thing that for me has evolved is I used to think we just want to take the human out of the equation of network management. The way I think about it now is there is a place for the human in the network management operation because sometimes a human has context and that context matters. And so I think what the, like, for example, we have this paper in HotNets 2023 where we talk about how to put an LLM in the incident management loop, and then there, we carefully talk about, OK, these are the places a human needs to be involved, at least given where LLMs are right now, to be able to ensure that everything happens in a safe way.

HUIZINGA

So go back to this “automated  network management” thing. This sounds   to me like you're in a space where it  could be, but it isn't ready yet …

ARZANI

Yeah.

HUIZINGA

… and without, sort of, asking  you to read a crystal ball about it,   do you feel like this is something  that could be eventually?

ARZANI

I hope so. This is the best thing  about research. You get to be like, yeah!

HUIZINGA

Yeah, why not?

ARZANI

Why not? And, you know,  maybe somebody will prove me wrong,   but until they do, that's  what I'm working towards!

HUIZINGA

Well, right now it's  an animating “what if?”

ARZANI

Yeah.

HUIZINGA

Right?

ARZANI

Yeah.

HUIZINGA

This is a problem Behnaz is  interested in right now. Let's go!

ARZANI

Yeah. Pretty much. [LAUGHTER]

HUIZINGA

OK. Behnaz, the systems and  networks that we've come to depend on   are actually incredibly complex. But  for most of us, most of the time,   they just work. There's only drama when they  don't work, right? But there's a lot going on   behind the scenes. So I want you to talk a  little bit about how the cycle of configuring,   managing, reconfiguring, etc.,  helps keep the drama at bay.

ARZANI

Well … you reminded me of something! So when I was preparing my job … I'm going to tell this story really, really quickly. But when I was preparing my job talk, somebody showed me a tweet. In 2014, I think, people started calling 911 when Facebook was down! Because of a networking problem! [LAUGHS] Yeah. So that's a thing. But, yeah, so network availability matters, and we don't notice it until it's actually down.

But that aside, back to your question. So I think what operators do is they build systems in a way that tries to avoid that drama as much as possible. So, for example, they try to build systems that these systems configure the network. And one of my dear friends, Ryan Beckett, works on intent-driven networking that essentially tries to ensure that what the operators intend with their configurations matches what they actually push into the network. They also monitor the network to ensure that as soon as something bad happens, automation gets notified. And there's automation also that tries to fix these problems when they happen as much as possible.

There's a couple of problems that happen in the middle of this. One of them is our networks continuously change, and what we use in our networks changes. And there's so many different pieces and components of this, and sometimes what happens is, for example, a team decides to switch from one protocol to a different protocol, and by doing that, it impacts another team's systems and monitoring and what expectations they had for their systems, and then suddenly it causes things to go bad. And they have to develop new solutions taking into account the changes that happened. And so one of the things that we need to account for in this whole process is how evolution is happening. And like evolution-friendly, I guess, systems, maybe, is how you should be calling it.

HUIZINGA

Right.

ARZANI

But that's one. The other  part of it that goes into play is,   most of the time you expect a particular  traffic characteristic, and then suddenly,   you have one fluke event that, kind of, throws  all of your assumptions out the window, so …

HUIZINGA

Right. So it's a never-ending job …

ARZANI

Pretty much.

HUIZINGA

It's about now that I ask all my guests what could possibly go wrong if, in fact, you got everything right. And so for you, I'd like to earth this question in the broader context of automation and the concerns inherent in designing machines to do our work for us. So at an earlier point in your career—we talked about this already—you said you believed you could automate everything. Cool. Now you're not so much on that. Talk about what changed your thinking and how you're thinking now.

ARZANI

OK, so the shallow answer to that question—there's a shallow answer, and there's a deeper answer—the shallow answer to that question is I watched way too many movies where robots took over the world. And honestly speaking, there's a scenario that you can imagine where automation starts to get things wrong and then keeps getting things wrong, and wrong, not by the definition of automation. Maybe they're doing things perfectly by the objectives and metrics that you used to design them …

HUIZINGA

Sure.

ARZANI

… but they're screwing things up in  terms of what you actually want them to do.

HUIZINGA

Interesting.

ARZANI

And if everything is automated and  you don't leave yourself an intervention plan,   how are you going to take control back?

HUIZINGA

Right. So this goes back to the  humans-in-the-loop/humans-out-of-the-loop.   And if I remember in our last podcast, we  were talking about humans out of the loop.

ARZANI

Yeah.

HUIZINGA

And you've already talked a  bit about what the optimal place for a   human to be is. Is the human always going to  have to be in the loop, in your opinion?

ARZANI

I think it's a scenario where you  always give yourself a way to interrupt. Like,   always put a back door somewhere. When we notice  things go bad, we have a way that's foolproof that   allows us to shut everything down and take control  back to ourselves. Maybe that's where we go.

HUIZINGA

How do you approach  the idea of corner cases?

ARZANI

That's essentially what my research  right now is, actually! And I love it,   which is essentially figuring out, in a  foolproof way, all the corner cases.

HUIZINGA

Yeah?

ARZANI

Can you build a tool that will tell  you what the corner cases are? Now, granted,   what we focus on is performance  corner cases. Nikolaj Bjørner,   in RiSE—so RiSE is Research in  Software Engineering—is working on,   how do you do verification corner cases? But all  of them, kind of, have a hand-in-hand type of,   you know, Holy Grail goal, which is,  how do you find all the corner cases?

HUIZINGA

Right. And that, kind of,  is the essence of this “What could   possibly go wrong?” question,  is looking in every corner …

ARZANI

Correct.

HUIZINGA

… for anything that could go wrong. So many people in the research community have observed that the speed of innovation in generative AI has shrunk the traditional research-to-product timeline, and some people have even said everyone's an applied researcher now. Or everyone's a PM. [LAUGHS] Depends on who you are! But you have an interesting take on this, Behnaz, and it reminds me of a line from the movie Nanny McPhee: “When you need me but do not want me, then I will stay. When you want me but no longer need me, I have to go.” So let's talk a little bit about your perspective on this idea-to-ideation pipeline. How and where are researchers in your orbit operating these days, and how does that impact what we might call “planned obsolescence” in research?

ARZANI

I guess the thing I'm seeing is that we are freed up to dream more—in a way. Maybe that's me being too … I'm a little bit of a romantic, so this is that coming out a little bit, but it's, like, because of all this, we have the time to think bigger, to dream bigger, to look at problems where maybe five years ago, we wouldn't even dare to think about. We have amazingly, amazingly smart, competent people in our product teams. Some of them are actually researchers. So there's, for example, the Azure systems research group that has a lot of people that are focused on problems in our production systems. And then you have equivalents of those spread out in the networking sphere, as well. And so a lot of complex problems that maybe like 10 years ago Microsoft Research would look at nowadays they can handle themselves. They don't need us. And that's part of what has allowed us to now go and be like, OK, I'm going to think about other things. Maybe things that, you know, aren't relevant to you today, but maybe in five years, you'll come in and thank me for thinking about this!

HUIZINGA

OK. Shifting gears here! In a recent conversation, I heard a colleague refer to you as an “idea machine.” To me, that's one of the greatest compliments you could get. But it got me wondering, so I'll ask you: how does your brain work, Behnaz, and how do you get ideas?

ARZANI

Well, this has been, to my chagrin, one of the realities of life about my brain apparently. So I never thought of this as a strength. I always thought about it as a weakness. But nowadays, I'm like, oh, OK, I'm just going to embrace this now! So I have a random brain. It’s completely ran—so, like, it actually happens, like, you're talking, and then suddenly, I say something that seems to other people like it came out of left field. I know how I got there. It’s essentially kind of like a Markov chain. [LAUGHTER] So a Markov chain is essentially a number of states, and there's a certain probability you can go from one state to the other state. And, actually, one of the things I found out about myself is I think through talking for this exact reason. Because people see this random Markov chain by what they say, and it suddenly goes into different places, and that's how ideas come about. Most of my ideas have actually come through when I've been talking to someone.
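To make the Markov chain description above concrete, here is a minimal Python sketch, offered as an illustration rather than anything discussed in the episode. The topic names and probabilities in `TRANSITIONS` and the helper `walk` are invented for the example; the only property taken from the description is that each state has a fixed probability of moving to each other state.

```python
import random

# Hypothetical states and transition probabilities; each row sums to 1.
TRANSITIONS = {
    "networks": {"networks": 0.5, "puzzles": 0.3, "movies": 0.2},
    "puzzles": {"networks": 0.6, "puzzles": 0.2, "movies": 0.2},
    "movies": {"networks": 0.4, "puzzles": 0.4, "movies": 0.2},
}


def walk(transitions, start, steps, rng):
    """Simulate the chain: the next state depends only on the current one."""
    state = start
    visited = [state]
    for _ in range(steps):
        options = list(transitions[state])
        weights = [transitions[state][option] for option in options]
        # Draw the next state according to the current state's row.
        state = rng.choices(options, weights=weights, k=1)[0]
        visited.append(state)
    return visited


# A seeded generator keeps the walk repeatable from run to run.
print(" -> ".join(walk(TRANSITIONS, "networks", 8, random.Random(2024))))
```

The jump "out of left field" corresponds to a low-probability transition: unlikely at any single step, but almost certain to show up somewhere in a long enough conversation.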

HUIZINGA

Really?

ARZANI

Yeah.

HUIZINGA

Them talking or you talking?

ARZANI

Both.

HUIZINGA

Really?

ARZANI

So it's, like, basically, I think the thing that has recently … like, I've just noticed more—again, being more mindful does that to you—it's like I'm talking to someone. I'm like, I have an idea. And it's usually they said something, or I was saying something that triggered that thought coming up. Which doesn't happen when … I'm not one of those people that you can put in a room for three days—somebody actually once told me this— [LAUGHTER] like, I'm not one of those people you can put in a room for three days and I come out with these brilliant ideas. It's like you put me in a room with five other people, then I come out with interesting ideas.

HUIZINGA

Right. … It's the interaction.

ARZANI

Yeah.

HUIZINGA

I want to link this idea of the  ideas that you get to the conversations   you have and maybe go back to linking it  to the work you've recently done. Talk   about some of the projects, how they came  from idea to paper to product even …

ARZANI

Mm-hm. So like one of the works that we were doing was this work on, like, max-min fair resource allocation that recently got published in NSDI and is actually in production. So the way that came out is I was working with a bunch of other researchers on risk estimation, actually, for incident management of all things, which was, how do you figure out if you want to mitigate a particular problem in a certain way, how much risk it induces as a problem. And so one of the people who was originally … one of the original researchers who built our wide-area traffic engineering controller, which we were talking about earlier, he said, “You're solving the max-min fair problem.” We're like, really? And then this caused a whole, like, one-year collaboration where we all sat and evolved this initial algorithm we had into a … So initially it was not a multipath problem. It had a lot of things that didn't fully solve the problem of max-min fair resource allocation, but it evolved into that. Then we deployed it, and it improved the SWAN solver by a factor of three in terms of how fast it solved the problem and didn't have any performance impact, or at least very little. And so, yeah, that's how it got born.

HUIZINGA

OK. So for those of us  who don't know, what is max-min   fair resource allocation, and  why is it such a problem?

ARZANI

Well, so remember I said that in our wide area network, we route traffic from one place to the other in a way that meets capacity. So one of the objectives we try to meet is we try to be fair in a very specific metric. So max-min is just the metric of fairness we use. And that basically means you cannot improve what you allocated to one piece of traffic in a way that would hurt anybody who has gotten less. So there's a little bit of a, like, … it's a mind bend to wrap your head a little bit around the max-min fair definition. But the reason making it faster is important is if something fails, we need to quickly recompute what the paths are and how we route traffic. So the faster we can solve this problem, the better we can adapt to failures.
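To make the max-min fair definition above concrete, here is a minimal sketch for the simplest possible case: several flows sharing one link. This is the textbook "water-filling" procedure, not the multipath algorithm discussed in the episode, and the flow names, demands, and capacity are made up for illustration.

```python
def max_min_fair(capacity, demands):
    """Water-filling on a single shared link (a simplifying assumption).

    Repeatedly offer every unsatisfied flow an equal share of what is left.
    Flows that need no more than that share get their full demand; the
    capacity they leave unused is split among the remaining flows.
    """
    alloc = {name: 0.0 for name in demands}
    remaining = float(capacity)
    unsatisfied = set(demands)
    while unsatisfied and remaining > 0:
        share = remaining / len(unsatisfied)
        fits = {f for f in unsatisfied if demands[f] <= share}
        if not fits:
            # Nobody can be fully satisfied: split what is left equally.
            for f in unsatisfied:
                alloc[f] = share
            break
        for f in fits:
            alloc[f] = float(demands[f])
            remaining -= demands[f]
        unsatisfied -= fits
    return alloc


# Hypothetical example: three flows competing for a link of capacity 10.
alloc = max_min_fair(10, {"a": 2, "b": 4, "c": 10})
print(alloc)
```

In this example flow "c" could only get more by taking capacity from "a" or "b", neither of which has more than "c" does, which is the property described above.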

HUIZINGA

So talk a little bit about  some of the work that started as an   idea and you didn't even maybe know that  it was going to end up in production.

ARZANI

There was this person from Azure Networking came and gave a talk in our group. And he's a person I've known for years, so I was like, hey, do you want to jump on a meeting and talk? So he came into that meeting, and I was like, OK, what are some of the things you're curious about these days? You want to answer these days? And it was like, yeah, we have this heuristic we're using in our traffic engineering solution, and essentially what it does is to make the optimization problem we solve smaller. If a piece of traffic is smaller than a particular, like, arbitrary threshold, we just send it on a shortest path and don't worry about it. And then we optimize everything else. And I just want to know, like, what is the optimality gap of this heuristic? How bad can this heuristic be? And then I had worked on Stackelberg games before, in my PhD. It never went anywhere, but it was an idea I played around with, and it just immediately clicked in my head that this is the same problem. So Stackelberg games are a leader-follower game where in this scenario a leader has an objective function that they're trying to maximize, and they control one or multiple of the inputs that their followers get to operate over. The followers, on the other hand, don't get to control anything about this input. They have their own objective that they're trying to maximize or minimize, but they have other variables in their control, as well. And what their objective is, is going to control the leader's payoff. And so this game is happening where the leader has more control in this game because it's, kind of, like the followers are operating in subject to whatever the leader says, right. But the leader is impacted by what the followers do. And so this dynamic is what they call a Stackelberg game. And the way we map the MetaOpt problem to this is the leader in our problem wants to maximize the difference between the optimal and the heuristic. It controls the inputs to both the optimal and the heuristic. And now this optimal and heuristic algorithms are the followers in that game. They don't get to control the inputs, but they have other variables they control, and they have objectives that they want to maximize or minimize.
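The leader-follower framing above can be sketched on a toy example. Everything here is an assumption made for illustration: a single source-destination pair with one short and one long path, made-up capacities and threshold, and a brute-force search standing in for the leader. It is not how MetaOpt itself is implemented.

```python
from itertools import product

SHORT_CAP = 10  # hypothetical capacity of the shortest path
LONG_CAP = 10   # hypothetical capacity of the alternate path
THRESHOLD = 5   # demands at or below this are pinned to the shortest path


def optimal(demands):
    """Follower 1: carry as much traffic as the two paths allow in total."""
    return min(sum(demands), SHORT_CAP + LONG_CAP)


def heuristic(demands):
    """Follower 2: pin small demands to the short path, then route the rest."""
    pinned = sum(d for d in demands if d <= THRESHOLD)
    large = sum(d for d in demands if d > THRESHOLD)
    pinned_carried = min(pinned, SHORT_CAP)  # pinned traffic cannot spill over
    residual = (SHORT_CAP - pinned_carried) + LONG_CAP
    return pinned_carried + min(large, residual)


def worst_case(num_demands=3, max_demand=10):
    """Leader: choose the input that maximizes optimal minus heuristic."""
    best_gap, best_input = -1, None
    for demands in product(range(max_demand + 1), repeat=num_demands):
        gap = optimal(demands) - heuristic(demands)
        if gap > best_gap:
            best_gap, best_input = gap, demands
    return best_gap, best_input


gap, demands = worst_case()
print(gap, demands)
```

The leader only picks the demands; each follower then does its own routing on that shared input, and the difference between their results is the leader's payoff. Enumeration works only because the toy input space is tiny.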

HUIZINGA

Right.

ARZANI

And so that's how the Stackelberg-game dynamic comes about. And then we got other researchers in the team involved, and then we started talking, and then it just evolved into this beast right now that is a tool, MetaOpt, that we released, I think, a couple of months ago. And another piece that was really cool was people from ETH Zürich came to us and were like, oh, you guys analyzed our heuristic! We have a better one! Can you analyze this one? And that was a whole fun thing we did where we analyzed their heuristics for them. And, then, yeah ...

HUIZINGA

Yeah. So all these things that you're  mentioning, are they findable as papers? Were   they presented at conferences, and where  are they in anybody's usability scenario?

ARZANI

So the MetaOpt tool that  I just mentioned, that one is in …   it’s an open-source tool. You can go online and  search for MetaOpt. You'll find the tool. We're   here to support anything you need; if you  run into issues, we’ll help you fix it.

HUIZINGA

Great. You can probably find  all of these papers under publications   on your bio page on the website,  Microsoft Research website. Cool. If anyone wants to do that. So,  Behnaz, the idea of having ideas is cool to me,   but of course, part of the research  problem is identifying which ones you   should go after [LAUGHS] and which  ones you shouldn't. So, ironically,   you've said you're not that good at that part  of it, but you're working at getting better.

ARZANI

Yes.

HUIZINGA

So first of all, why  do you say that you're not very   good at it? And second of all,  what are you doing about it?

ARZANI

So I, as I said, get attracted to  puzzles, to hard problems. So most of the   problems that I go after are problems I have no  idea how to solve. And that tends to be a risk.

HUIZINGA

Yeah.

ARZANI

Where I think people who are better at selecting problems are those who actually have an idea of whether they'll be able to solve this problem or not. And I never actually asked myself that question before this year. [LAUGHTER] So now I'm trying to get a better sense of, how do I figure out if a problem is solvable or not before I try to solve it? And also, just what makes a good research problem? So what I'm doing is, I'm going back to the era that I thought had the best networking papers, and I'm just trying to dissect what makes those papers good, just to understand better for myself, to be like, OK, what do I want to replicate? Replicate, not in terms of techniques, but in terms of philosophy.

HUIZINGA

So what you're looking at is  how people solve problems through the work   that they did in this arena. So what are you  finding? Have you gotten any nuggets of …

ARZANI

So a couple. So one of my favorite papers is Van Jacobson's TCP paper. The intuition is amazing to me. It’s almost like he has a vision of what's happening, is the best I can describe it. And another example of this is also early-on papers by people like Ratul Mahajan, Srikanth Kandula, those guys, where you see that they start with a smaller example that, kind of, shows how this problem is going to happen and how they're going to solve it. I mean, I did this in my work all the time, too, but it was never conscious. It's more of like that goes to that mindfulness thing that I said before, too. It's like you might be doing some of these already, but you don’t notice what you're doing. It more of is, kind of, like putting of like, oh, this is what they did. And I do this, too. And this might be a good habit to keep but cultivate into a habit as opposed to an unconscious thing that you're just doing.

HUIZINGA

Right. You know, this whole idea  of going back to what's been done before,   I think that's a lesson about looking at  history, as well, and to say, you know,   what can we learn from that? What are we  trying to reinvent that maybe doesn't need   to be reinvented? Has it helped you to get more  targeted on the kinds of problems that you say,   “I'm not going to work on that.  I am going to work on that”?

ARZANI

To be very, very, very fair, I haven't  done this for a long time yet! This has been …

HUIZINGA

A new thing.

ARZANI

I started this this month, yeah.

HUIZINGA

Oh my goodness!

ARZANI

So we’ll see how far I get and  how useful it ends up being! [LAUGHS] [MUSIC BREAK]

HUIZINGA

One of my favorite things to talk about  on this show is what my colleague Kristina calls   “outrageous” lines of research. And so I've  been asking all my guests about their most   outrageous ideas and how they turned out.  So sometimes these ideas never got off the   ground. Sometimes they turned out great. And  other times, they've failed spectacularly. Do   you have a story for the “Microsoft  Research Outrageous Ideas” file?

ARZANI

I had this question of, if language has grammar, and grammar is what LLMs are learning, which, to my understanding of what people who are experts in this field say, this maybe isn't that, but if it is the case that grammar is what allows these LLMs to learn how language works, then in networking, we have the equivalent of that, and the equivalent of that is essentially network protocols. And everything that happens in a network, you can define it as an event that happens in a network. You can think of those, like, the events are words in a language. And so, is it going to be the case, and this is a question which is, if you take an event abstraction and encode everything that happens in a network in that event abstraction, can you build an equivalent of an LLM for networks? Now what you would use it for—this is another reason I've never worked on this problem—I have no idea! [LAUGHTER] But what this would allow you to do is build the equivalent of an LLM for networking, where actually you just translate that network's events into, like, this event abstraction, and then the two understand each other. So like a universal language of networking, maybe. It could be cool. Never tried it. Probably a dumb idea! But it's an idea.

HUIZINGA

What would it take to try it?

ARZANI

Um … I feel like bravery is, I  think, one because with any risky idea,   there's a probability that you will fail.

HUIZINGA

As a researcher here at Microsoft  Research, when you have this idea,   um … and you say, well, I'm not brave  enough … even if you were brave enough,   who would you have to convince  that they should let you do it?

ARZANI

I don't think anybody!

HUIZINGA

Really?

ARZANI

That's the whole … that's  the whole point of me being here!   I don't like being told what to do! [LAUGHS]

HUIZINGA

Back to the beginning!

ARZANI

Yeah. The only thing is that, maybe, like,   people would be like, what have you  been doing in the past six months? And   I wouldn't have ... that's the risk.  That's where bravery comes in. The bravery is more of there is  a possibility that I have to devote three   years of my life into this, to figuring out  how to make that work, and I might not be   able to. And there's other things. So it's a  tradeoff also of where you put your time.

HUIZINGA

Sure.

ARZANI

So there. Yeah.

HUIZINGA

And if, but … part of it would be  explaining it in a way to convince people:   if it worked, it would be amazing!

ARZANI

And that's the other problem with  this idea. I don't know what you would   use it for. If I knew what you would use it  for, maybe then it would make it worth it.

HUIZINGA

All right. Sounds like you  need to spend some more time …

ARZANI

Yeah.

HUIZINGA

…ruminating on it. Um, yeah. The whole  cliché of the solution in search of a problem.

ARZANI

Yeah.

HUIZINGA

[LAUGHS] As we close, I want to talk a little bit about some fun things. And so, aside from your research life, I was intrigued by the fact, on your bio page, that you have a rich artistic life, as well, and that includes painting, music, writing, along with some big ideas about the value of storytelling. So I'll take a second to plug the bio page. People, go look at it because she's got paintings and cool things that you can link to. As we close, I wonder if you could use this time to share your thoughts on this particular creative pursuit of storytelling and how it can enhance our relationships with our colleagues and ultimately make us better researchers and better people?

ARZANI

I think it's not an understatement to say I had a life-changing experience through storytelling. The first time I encountered it, it was the most horrific thing I had ever seen! I had gone on Meetup—this was during COVID—to just, like, find places to meet people, build connections and all that, and I saw this event called “Storytelling Workshop,” and I was like, good! I'm good at making up stories, and, you know, that's what I thought it was. Turns out it's, you go and tell personal stories about your life that only involve you, that make you deeply vulnerable. And, by the way, I'm Iranian. We don't do vulnerability. It’s just not a thing. So it was the most scary thing I've ever done in my life. But you go on stage and basically talk about your life. And the thing it taught me by both telling my own stories and listening to other people's stories is that it showed me that you can connect to people through stories, first of all. The best ideas come when you're actually in it together. Like one of the things that now I say that I didn't used to say, we, we're all human. And being human essentially means we have good things about ourselves and bad things about ourselves. And as researchers, we have our strengths as researchers, and we have our weaknesses as researchers. And so when we collaborate with other people, we bring all of that. And collaboration is a sacred thing that we do where we're basically trusting each other with bringing all of that to the table and being that vulnerable. And so our job as collaborators is essentially to protect that, in a way, and make it safe for everybody to come as they are. And so I think that's what it taught me, which is, like, basically holding space for that.

HUIZINGA

Yeah. How's that working?

ARZANI

First of all, I stumbled into it,   but there are people who are  already “that” in this building …

HUIZINGA

Really?

ARZANI

… that have been for years. It's just  that now I can see them for what they bring,   as opposed to before, I didn't  have the vocabulary for it.

HUIZINGA

Gotcha …

ARZANI

But people who don't, it’s like what  I've seen is almost like they initially look   at you with skepticism, and then they think it's  a gimmick, and then they are like, what is that?   And then they become curious, and then  they, too, kind of join you, which is very,   very interesting to see. But, like, again,  it’s something that already existed. It's   just me not being privileged enough to know  about it or, kind of, recognize it before.

HUIZINGA

Yeah. Can that become part of a culture,   or do you feel like it is part of the  culture here at Microsoft Research, or … ?

ARZANI

I think this depends on how  people individually choose to show   up. And I think we're all, at the end  of the day, individuals. And a lot of   people are that way without knowing they are  that way. So maybe it is already part of the   culture. I haven't necessarily sat down and  thought about it deeply, so I can't say.

HUIZINGA

Yeah, yeah. But it would  be a dream to have the ability to be   that vulnerable through storytelling  as part of the research process?

ARZANI

I think so. We had a storytelling coach  that would say, “Tell your story, change the   world.” And as researchers, we are attempting  to change the world, and part of that is our   stories. And so maybe, yeah! And basically, what  we're doing here is, I'm telling my story. So …

HUIZINGA

Yeah.

ARZANI

… maybe you're changing the world!

HUIZINGA

You know, I'm all in! I'm here for  it, as they say. Behnaz Arzani. It is such a   pleasure—always a pleasure—to talk to you. Thanks  for sharing your story with us today on Ideas.

ARZANI

Thank you.

[MUSIC]

Transcript source: Provided by creator in RSS feed