How OpenAI’s engineering org is reshaping teams, roles and workflows w/ Sulman Choudhry #250

Mar 10, 2026 · 44 min

Summary

Sulman Choudhry, Head of ChatGPT at OpenAI, discusses how AI is transforming engineering teams, roles, and workflows. He explains the shift to fluid, mission-driven teams, the application of vertical and horizontal structures, and the importance of cross-functional collaboration. The conversation also covers the Directly Responsible Individual (DRI) framework, AI's impact on manager roles, two-way mentorship in an AI-native environment, and the critical skill of problem framing for modern engineers.

Episode description

In this episode, recorded live at the OpenAI studio, Sulman Choudhry (Head of ChatGPT @ OpenAI) pulls back the curtain on how they structure engineering teams! We talk about shifting from silos to fluid mission-driven teams, vertical vs. horizontal teams, maximizing cross-functional collaboration between research, engineering, product and design. Plus we cover “directly responsible individuals” for high accountability, managers as systems designers, scaling decision-making to prevent leadership from becoming bottlenecks, frameworks for mentoring junior engineers, why “problem framing” is the most critical skill, and how managers can stay close to problems and maintain technical intuition.

 

ABOUT SULMAN CHOUDHRY

Sulman leads ChatGPT Engineering at OpenAI, driving the development and scaling of one of the world’s most impactful AI products. He pushes the boundaries of innovation by turning cutting‑edge research into practical, accessible tools that transform how people interact with technology. Previously at Meta, Sulman founded and scaled Instagram Reels, IGTV, and Instagram Labs, and helped lead the early development of Instagram Stories.

He also brought MetaAI to Instagram and Messenger, integrating generative AI into experiences used by billions. Earlier in his career, Sulman was on the founding team that built and launched UberEATS from the ground up, helping turn it into a global food delivery platform. With a track record of marrying technical vision, product strategy, and large‑scale execution, Sulman focuses on building products that meaningfully change how people live, work, and connect.

 

This episode is brought to you by xMatters!

xMatters automates the entire incident lifecycle with their purpose-built AI powered workflow, giving your team the context they need to stop disruptions before they start and minimize resolution times.

Head over to xmatters.com to learn more!

 

SHOW NOTES:
  • The Shift to AI-Native Engineering: How AI is collapsing the "Inner Loop" and reshaping engineering team composition (2:48)
  • Mission-Driven Teams: Moving from traditional functional silos to integrated, problem-centric units (4:45)
  • Vertical vs. Horizontal Team Architecture: How OpenAI structures specialized horizontal teams (ex. Infrastructure, RTC/Voice) with product verticals (7:04)
  • Fluid org charts & blurring functional roles: AI-Native teams require proactive mission alignment and coordination over rigid structure (8:48)
  • The Lifecycle of Problem-Oriented Teams: What happens when a "strike team" solves the problem (10:02)
  • Maximizing cross functional collaboration between engineering, research, product and design (11:52)
  • The DRI Framework: Implementing the "Directly Responsible Individual" model for high-velocity accountability (13:32)
  • Thriving in the "Chaos Factory": Addressing bottlenecks in highly dynamic, high-volume environments (16:02)
  • Prioritization & "Letting 1,000 Flowers Bloom": How OpenAI decides which AI bets to double down on (19:13)
  • Scaling Decision-Making: Preventing leadership from becoming the bottleneck as volume increases (21:19)
  • Knowing when to call it quits on a bet and reallocate talent for maximum impact (23:29)
  • The Manager as "Systems Designer": Shifting the EM role from people logistics to technical orchestration (24:49)
  • The Barbell Talent Strategy: Optimizing for innovation by pairing "super seniors" with "super juniors" (28:10)
  • Mentorship in the AI Age: How to coach junior engineers when the "cost of code" is approaching zero (30:19)
  • Technical Intuition for Leaders: Sulman’s frameworks for staying "close to the metal" as a manager (33:17)
  • Cultivating Judgment: Why "Problem Framing" is the most critical skill for the modern engineer (37:01)
  • Rapid fire questions (38:59)

 

LINKS AND RESOURCES:

 

This episode wouldn’t have been possible without the help of our incredible production team:

Patrick Gallagher - Producer & Co-Host

Jerry Li - Co-Host

Noah Olberding - Associate Producer, Audio & Video Editor https://www.linkedin.com/in/noah-olberding/

Dan Overheim - Audio Engineer, Dan’s also an avid 3D printer - https://www.bnd3d.com/

Ellie Coggins Angus - Copywriter, Check out her other work at https://elliecoggins.com/about/


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Transcript

Intro / Opening

We're doing a special in-episode feature on the future of AI-powered incident management with our friends and sponsor xMatters. People as a primary integration layer is really fragile. With multiple people and all of that coordination, you become slower to find the root cause. The slower you find the root cause, you then don't know what action you need to take to resolve it. Getting to that fast is the goal.

Later in the episode, Mike Bennett, who leads the engineering team at xMatters, shares why human-driven coordination creates outage risk and how AI-powered orchestration can dramatically accelerate your path from event to resolution.

OpenAI is a company where we do spend some amount of time on planning on a yearly, quarterly, even monthly basis, just because these loops of development are tightening more and more. Part of the planning is to really have some statements around: here's what success looks like for us on this time horizon. What problem are we solving, and being very, very clear on who is solving that problem. And the further out you go, the more imprecise it gets, or the more high-level it gets.

And then from there, there's the idea of, okay, so here's what success looks like for us in this time horizon. Here are all the ways that we're going to measure that. And we do a fairly good job at disseminating that to the entire company. We're an extremely high-trust environment as well, where people don't consider these to be competitive with each other. They have the humility to learn from each other and their explorations.

Hello and welcome to the Engineering Leadership Podcast, brought to you by ELC, the Engineering Leadership Community. I'm Jerry Li, founder of ELC. And I'm Patrick Gallagher, and we're your hosts. Our show shares the most critical perspectives, habits, and examples of great software engineering leaders to help evolve leadership in the tech industry.

In this episode, we joined Sulman Choudhry, Head of ChatGPT at OpenAI, live at the OpenAI podcast studio and office for an expansive conversation on how their engineering organization, roles, and workflows are changing. We talk about OpenAI's shift from silos to fluid, mission-driven teams; running vertical versus horizontal team structures; maximizing cross-functional collaboration between research, engineering, product, and design; the directly responsible individual framework for high accountability; managers as systems designers; scaling decision-making to prevent leadership from becoming bottlenecks; frameworks for mentoring junior engineers; why problem framing is one of the most critical skills for engineers right now; and how engineering managers can stay close to problems and maintain technical intuition. Enjoy our conversation with Sulman Choudhry.

So you and I talked a couple of weeks ago to ideate on where we wanted to go. Since then, we had three different dinners with CTOs at a lot of different scales of company size and team size. And the context of that conversation was

AI Reshaping Engineering Teams

People wrestling with how teams are changing, how workflows are changing, and how that impacts productivity was a big part of the conversation. Everybody was representing these perspectives on how engineering is shifting as people try to become more AI-native. And there was just a lot of churn around what to do and what it looks like to do all these things.

I know this is an area where you've spent a considerable amount of time thinking, building, and experimenting, and have learned a lot along the way. Our goal is to deconstruct all of your perspectives around teams and organizations, workflows, and how people on RTV can help implement that. So I want to start with how AI is reshaping teams and organizations for engineering. Maybe we can talk generally about team composition, maybe where it started and how you're thinking about that right now. I've got a lot of ways we can deconstruct that, but let's just start with team composition.

And maybe I'll take it in a slightly different direction: I'll start by answering what has not changed, and then we can get into what has changed. If I think about how engineers spend their time, engineers at the end of the day are builders, and we should probably talk about it later.

Engineers don't have to be the only function that builds. AI is changing things so that everyone gets to become a builder. But an engineer's job, at the end of the day, is to look at a problem, understand that problem, break it into sub-problems, and then oftentimes write code to solve it. After they write the code, they test it and all of that. Once it's ready, they put it in production, and in production they're getting signals and learnings, and then they're iterating, and that cycle just continues to repeat itself. The idea here is that they're building and maximizing value for the user.

So if you think about it, there are two different loops here. There's an inner loop of engineers writing code and shipping code. And then there's the outer loop: once you ship the code, how do you get the signals? How do you get the learnings, and how do you then maximize user value? So the thing that has not changed is that there's still this idea of an inner loop and an outer loop. But the biggest thing that is changing is that the inner loop is getting tighter and tighter, and AI is driving an

Mission-Driven Team Structures

extreme amount of efficiency in that inner loop. And I would argue that it is collapsing the cost of generating code to be the cost of generating tokens, and that cost is also getting closer and closer to zero. So that's the big framework I have in mind: this inner loop and outer loop, and the idea that the inner loop is the one where we're experiencing a crazy amount of efficiency and productivity gains for engineers.

So when you start to think about the dynamics there, how do you start to approach forming teams? For example, there are so many different product surfaces at OpenAI, and specifically within ChatGPT. How do you even begin to orient teams around being AI-native across all these different product surfaces? What does that start to look like?

So we've iterated a lot at OpenAI on the best way for us to organize ourselves. I would say, let's say two years ago or so, we were very much a research lab that was just learning how to build and ship product. And it's ironic that I say that, because even at that time, ChatGPT was the fastest-growing product in the history of all products and the fastest-growing business.

But as a company, we have always been very much a research company that happens to have a very successful product. So back then, a lot of our take was that we would have these amazing researchers who continue to iterate on models and improve them, and then at some point they would have different types of breakthroughs. Once they had those breakthroughs, we would start to collaborate with the engineering team: here's this amazing model breakthrough, find a way to bring it to the user and expose that value to the user. As we've all learned in our careers, anytime you throw things over the wall, things just don't go well. So the big learning for us was that we need to get all these functions and all these people working much more closely together.

So since then, we have been orienting many, if not all, of our teams around this idea of end-to-end teams. In these end-to-end teams, traditionally you would have your engineers, data scientists, designers, and PMs, but we also throw one more function into the mix, which is researchers. And all of them are very much aligned behind specific ideas or specific user problems

Vertical Versus Horizontal Teams

or metrics or any of that. And then they're iterating together in pursuit of that model breakthrough as well as a product breakthrough. And that leads to the best possible outcomes for us. A really good example of it is search. When we decided to take on the problem that ChatGPT needs to have access to as much real-time information as possible, we effectively got a group of probably half a dozen or so people across different functions, gave them that problem space, and then gave them literally a month or two to really think about how to solve that problem and iterate towards solving it. And that's where ChatGPT search came from.

I wanna dive into how some of those teams

in these conversations with folks in our team, this cross-functional relationship is becoming so much more seamless and smooth, and you can incorporate so many different collaborators. And then, to link together another part of OpenAI, there's a much more horizontal engineering team culture, having the people closest to the code be involved end to end. I think those are dynamics that folks are starting to feel, that trend and the push in that direction. But there's a gap of: how do I get there, and how do those teams work together in that way? So I was wondering if you could talk to us a little bit more about, when you have research scientists, infrastructure engineers, product engineers, designers, and PMs around one of those initiatives, how does that all work together to then come together with search?

Yeah. One thing that I would say in terms of org design is that we're also very explicit about what is a horizontal and what is a vertical.

So a lot of product teams are very much vertical teams, but then we also have horizontal teams. A really good example is when we launched our multimodal GPT-4o, which was our first natively multimodal model, and we had to launch Advanced Voice. Advanced Voice relied on some pretty deep voice and RTC infrastructure, and at that time we had absolutely no expertise there. So we did have a vertical team, a product team that was responsible for launching Advanced Voice,

Fluid Org Charts, Blurring Roles

and did a fairly reasonable job. But very quickly we realized that if OpenAI is going to take multimodal very seriously, we need a horizontal team. So we very explicitly started a team that was solely responsible for creating the infrastructure needed for real-time communication, not just for that team but for all the teams at OpenAI. So the first thing I would say is that we continue to have horizontal teams, but we're very clear about which teams are horizontal and which are vertical. At the same time, we have good alignment of dependencies across those teams.

And then on our vertical teams, what I have noticed is that the distinction across functions is starting to blur over time. A very obvious example is that researchers are not the only ones who can train models now. Engineers who understand user problems really well can collect the right samples, they know which hill to climb, and they can improve the models as well. PMs are no longer just writing product specs or setting the strategy. They're the ones going in and fixing bugs for us.

Designers don't just work in Figma. They're the ones actually creating prototypes, and some of those prototypes end up being part of the production code as well. So to summarize, we have the idea of horizontal teams, and we try to be as proactive as possible in starting those horizontal teams and making sure that they're adequately staffed.

And then we have vertical teams where we have all these functions. What we see organically emerge is that, because AI is collapsing the barrier to entry in terms of what it takes to be an expert or to be productive in a domain, all these functions contribute in all sorts of different ways, versus being siloed to just: here's my domain, and this is the only area I'm going to work on.

It sounds like some of the vertical teams are project-based. So we have a problem, let's get a team, solve the problem. What happens after the problem is solved? Is the lifespan of those teams typically tied to the project?

Yeah, yeah. Org structure at OpenAI is extremely fluid. I do an orientation class every week, or almost every week, where I welcome new engineers, and one of the things we talk about is how we are structured. And literally two times out of three, let's say, as I'm going through the slides, I'm like, okay, this slide where we show the org structure is out of date. Things are just extremely dynamic here. But at a high level, we have the principle that there are areas which are extremely evergreen, and we know that we're going to continue to invest in those areas. The core experience of the app is an example, where we need to make sure that ChatGPT continues to be the best, highest-quality, highest-craft chatbot in the world, and that is a very evergreen space. The org structure would reflect that: there would be a team called the core experience team, and that is always staffed. And then there are areas that we're extremely excited about for one reason or another.

And a good example of it would be about a year and a half to two years ago, we got extremely excited by the idea of personalization, that ChatGPT should be extremely personalized to you, the way that it answers, the personality of it, and the way that it behaves.

And there was no team that was responsible for that. So we started what I would describe as a project-based team. And that's an example of a team that then graduated to becoming an evergreen team, a team that we're always invested in. And there are also examples of areas where we've invested

Maximizing Cross-Functional Collaboration

and we just realize that there's nothing there, and then we divest from them. So overall we have a pretty fluid org structure. In many ways our org structure is modeled after what you see in conventional companies. The only thing is that because the cycles are so shortened, it's almost like there are teams that are always popping up and teams that we're always divesting from, and that's why things seem extremely fluid.

We're taking a quick break for a special feature on the future of AI-powered incident management with our friends and sponsor xMatters. Mike Bennett, who leads the engineering team at xMatters, shares why human-driven coordination creates outage risk and how AI-powered orchestration can dramatically accelerate your path from event to resolution.

We're the ones that are correlating the alerts across the platform. We're the ones that have to remember that a similar issue happened six months ago and this is what we did about it. We're the ones that have to figure out that this is a symptom in service A, but it has a dependency in service B, and we need to know what that dependency is and how it could impact this thing. We decide who is going to be paged based on some informal knowledge.

It's not scalable. All of that works in a very small-scale environment. But as systems grow, as teams grow, people as a primary integration layer is really fragile. So the outage risk is that, with multiple people and all of that coordination, you become slower to find the root cause. The slower you find the root cause, you then don't know what action you need to take

The DRI Framework for Accountability

to resolve it. The risk there is not knowing immediately what the problem is, so you don't know what the route to mitigation is. With all of the information that is out there, getting to that fast is the key goal, and it is the key problem when you're relying on people to do it. When a signal comes into xMatters, the first thing you can do, based off of that signal, is make a call out to the right people.

From there, the incident commander can use automations that are set up in the incident, because it automatically creates an incident for us. It's linked to the ticket that generated the incident. And from there we can determine, okay, I've seen this before, because my incident suggestions are saying this looks similar to an incident you had last week.

We've got built-in automations that can do stuff. So within an incident you might have an automation that says automatically restart pods or automatically roll back services. Like I mentioned before, we can also do that as part of a response to the signal that comes out, to say, okay, this has happened, do a rollback, and I can just touch my phone and go back to bed without even getting out of bed.

All of the automation, the flexibility of the tool, and all the things that you can build in, along with the data that you've got with the service catalog, your on-call, and your who's-on-duty, gets you the right people at the right time on the call if you need to get to a point where you're in a conference. xMatters automates the entire incident lifecycle, taking you from initial event to final resolution.

To see how their purpose-built AI slashes your resolution times and gives your team the context to stop disruptions before they start, head to xmatters.com. That's x-m-a-t-t-e-r-s.com.

I know the density of talent here is really strong. There are a lot of high achievers, and given what you described earlier, the boundaries between designers, engineers, and product are blurred and getting increasingly hard to tell. So how do you try to maximize the collaboration? How do you make sure people are collaborating so they're not stepping on each other's toes, they know what to be accountable for, et cetera?

The one huge learning for me at OpenAI, and it comes from the very early days as well, is this concept of a DRI. We've heard it, and I believe Amazon might be the one that coined the term before: directly responsible individual.

Managing Chaos and Prioritization

OpenAI is one company where we stay very true to that principle. I jokingly use the example that I've worked at companies where they have DRIs, but when you look at the list of DRIs, it's not a singular person, it's a list: a DRI for engineering, a DRI for design, a DRI for whatever. And I would say that, from my perspective, defeats the purpose of having a DRI.

So one thing that we're very big on is this idea of a DRI: for anything and any outcome that we really care about, we're extremely careful and very disciplined about who the DRI is. It's always one person who is the DRI, and it's generally the person who is closest to that problem space.

They could be from really any function at all. And being a DRI gives them, I would say, both a sense of authority and a sense of accountability towards the outcomes that we're responsible for. So I'd say the biggest thing in terms of improved collaboration and results for us is that we stay very true to the concept of DRIs at OpenAI.

This was a big question that came up over our dinners, because I was paraphrasing this concept to a lot of different CTOs, this idea of a DRI and how that can really drive accountability around different areas. What recommendations would you have for somebody who is trying to drive greater accountability or to create this directly responsible individual structure? How do you make that work? What would be the best way to introduce this?

I think the first thing that comes to mind is being very careful about how you pick the DRI. From what I've seen, the DRI oftentimes is not the most senior person on the team. Oftentimes it is not even a person from a predictable function, whether that be engineering or PM or whatever. We're very careful about that DRI being the person who is closest to the problem space they're going to work on and probably has the most knowledge about that problem space as well.

The other thing that is very true about the concept of a DRI for us is that these are generally the people who are most passionate about that space. There are people who could be very qualified but are not willing to lead from the front. The DRIs for us are the ones who are really willing to lead from the front. What that really means is that they're the people who embrace taking risks and setting the team down a certain path. At the same time, they're the ones who are very amenable to feedback and to changing direction as needed, because oftentimes you start down a certain path and then you realize it's not it. So they need to be amenable to changing that path, and to have enough humility there as well.

And the last thing I would say is that what I love about OpenAI is that we have a culture of psychological safety. We probably have a fair share of projects that just did not go the way we wanted. For those projects, there's a huge amount of focus on the right learnings as well as the positives that came out of them, as well as this idea that we don't take them to a point of diminishing returns. We effectively kill them if and when needed. So I would say the three-ish things that come to mind are: one, picking the right DRI, someone who understands the space and who is willing to lead from the front.

Prioritization and Decision-Making

But also is humble enough as well. And then the whole idea that you also create a culture of psychological safety, where you allow that DRI to truly operate as a DRI and, at the same time, focus more on learnings as part of the accountability, versus being in an environment where people are not willing to raise their hands or to take bold risks.

I want to ask about coordination, because when I'm starting to think about horizontal, vertical, and fluidly addressing problems with a lot of different contributors representing different disciplines, it seems from the outset like that's a lot of coordination. And that can be an area of friction, or an area that takes a lot of time and that a lot of people struggle with. So I want to get into that. And this may also bleed into things like how roles are shifting and how workflows change. But what I'm curious about, especially as things accelerate: the bottleneck is less about the code it'll take to build something and more about, are you picking the right thing? Are you addressing risks the right way, and are you setting up all this infrastructure? So when you think about coordination and getting all these different teams to interact, what are some of the practices that help drive good coordination across all of these different structures?

There are definitely a few, but just to give people a bit of a glimpse into the culture at OpenAI: some people describe it as a chaos factory, and it's not necessarily a bad thing. There's a lot of chaos, but then there's this order and these amazing things that come out of the chaos. And I think that's part of the magic of OpenAI as well. When it comes to coordination, one principle that I would start with is that OpenAI is a company where we let a lot of different flowers bloom, and we defer, instead of optimizing for premature coordination or prematurely bringing order to a space. A really good example is our foray into Codex as a product. We knew that coding was a use case that was extremely important to us, one where we could add so much value for our users and our developers. Depending on how you count, we probably had anywhere from three to five different internal efforts that were going at this problem space.

What worked really well for us for Codex was letting these different flowers bloom very independently, giving them the types of resources they need to succeed, and then at the right time making the call to converge

Scaling Decisions and Preventing Bottlenecks

in terms of the direction they want to converge on, as well as converging resources. And then going all in on what you see today, which is obviously an extremely successful product.

So that's a bit of a case study that I keep reminding myself of, because every once in a while you look around the company and you see there are so many things happening, and you're kinda like, oh, that's an anti-pattern. But I would say it's something that works really well for us.

And then I would say the second thing that we spend a lot of our uh time on is is just aligning towards what problem are we solving and being very, very clear on who is solving that problem. And oftentimes what we realize is that

You think that there is miscoordination because everyone's trying to solve the same problem in different ways, but oftentimes you just haven't taken the time to to describe that problem and align all what that problem is. And what you really realize is that more often than not it's It's different people that are attacking either different versions of the problem or

slightly different problems, and then they just need to coordinate with each other at the right time on how it all comes together as one larger piece. So that's OpenAI: we really thrive in chaos, and we're an extremely high-trust environment as well, where people don't consider these efforts to be competitive with each other. They have the humility to learn from each other and from their explorations.

A lot of what I've heard so far is kind of centered around a problem. The problem leads to a vertical team and maybe a horizontal team later on. How do you decide what problem to solve? I'm sure there are a lot of ideas. So as a leader, in terms of prioritization, I'm curious how that works.

OpenAI is a very mission-driven company, and even at orientation we talk about it quite a bit. The mission is to bring AGI for the benefit of humanity in a safe manner.

So one thing that makes all of this very easy is this alignment behind the mission, and the mission is extremely easy to explain as well. And I would say the second step here is that we do spend some amount of time on planning on a yearly, quarterly, even monthly basis, just because these loops of development are tightening more and more. Part of the planning is to really have

Knowing When to Quit Projects

some statements around what success looks like for us on this time horizon. And the further out you go, the more imprecise it gets, or the higher level it gets.

And then from there, there's the idea of, okay, so here's what success looks like for us in this time horizon. Here are all the ways that we're going to measure that. And we do a fairly good job at disseminating that to the entire company, just to make sure that everyone knows where we're going. And then from there, I would say,

the next level of detail is where things start to get a little bit more chaotic, and chaotic very much by design: different groups and different people really look at that and think about what is the best way for us to get there. That's where I would say maybe eighty percent of our bets are very much aligned behind that success statement and the metrics and all that we're chasing over a year.

And then there's always the rest of it, which is that it's a highly dynamic industry right now. There are things that are always happening and opportunities that are always emerging, and we're also just always learning as well. And there's definitely enough leeway for us to pivot, and culturally it's one of those where we're not beholden to our plans: if a better plan emerges, we very quickly snap to it as well.

Managers as Systems Designers

A two-part follow-up question around this, because I think there are sort of two problems here. Part of this is a question around decision making and decision rights around this fast-paced shift. The other side of this is one of the things that engineering leaders in our community have

sort of experienced or are talking about, the ones that are pushing the boundary of more AI-native development: now all of a sudden their feature portfolio, or the thousand flowers that are blooming, is a much higher volume of different things. So there's the coordination cost: how do you stay on top of this expanded portfolio of paths to go down, opportunities to pursue? And then the decision making in a really rapid, high-paced environment.

How do you drive the decision making in a way that doesn't become a bottleneck? So how do you deal with the high volume, and then how do you drive the decisions?

In terms of how you keep up with the high volume here: it's tough.

It's definitely difficult. There's a lot going on. And I would say another thing that is happening as well is that the span of control is getting wider and wider. So there are just so many things that need to be in your working memory on one end. But at the same time, I have seen a huge amount of benefit come from the AI-native or AI tooling as well, which helps you stay up to speed on those things.

A really good example of it: let's just say that there's a bet A that a team is executing on, and you really want to understand it at a deeper level.

More than likely you're going to look at some dashboard or something, and you're going to just see how that bet is doing. Or if you want to go even deeper, you probably need to talk to some sort of a data scientist or someone, and you need to ask the right questions and get the right analysis in place to see how things are going. Well, at OpenAI we built a tool that helps you write really good SQL, so you can answer any sort of question that you want in natural language.

And now all of a sudden what you have is that every engineer and every leader at the company has their own personal data scientist to collaborate with to answer these types of questions. So part of that would be that there are enough examples of this AI tooling that really help you stay up to speed and close to a lot more of what's going on, and that example replicates itself

in pretty much all the different domains in terms of where engineering users are spending their time. And then the second thing here would be something that is somewhat aspirational, but that we're trying to get better and better at, which is the idea of

when do you just call it quits, and how do you consider something done? Because there's always this tendency of fifteen different bets that have all made it into the product. They all have some level of product-market fit, but they're neither here nor there.

And then how do you make the difficult decisions on how to unship things? This is something that I learned at Instagram, where Instagram at some point realized that the app was just getting more and more heavy because there were just so many features going in.

Each feature has a very good reason for being there, and there are even some metrics that are associated with it, but in aggregate, and over time, the app just feels very bloated. And they instituted this thing that they call unship. Just like you celebrate shipping something, you also celebrate unshipping things, and that's something that really resonated with me.

And that's something that we are starting to get much better at over here as well. Agents is a really good example of it: there are so many different products that are aiming to become agentic products, and they all have a slightly different take on things. And then consolidating that into a single

agentic product that you're going after, and then being able to just unship things that you think don't make sense in that future world. So part of that is just that discipline, as well as a culture of celebrating not just things that are shipped but also

The Barbell Talent Strategy

those things that you have unshipped for the sake of simplicity as well.

So I'm shifting into how roles are reshaping here. A line of inquiry you and I were talking about is how the role of a manager or a leader is shifting. Managers are becoming system designers versus people logistics.

I was wondering if you'd talk a little bit more about observations you have on how that role is changing, and maybe some of the other systems that are becoming beneficial for engineering leaders to help deliver in the function.

I would say one thing that I would start with is that the role of engineering management at OpenAI has changed significantly. At any other company at the level of success that we're at, the engineering management role

probably would have turned into one where most of your job is org management and people management and all of that. But with the advent and all the benefits of AI tooling, we find that our managers just have more time in the day to do more things. And then we try to determine where those managers spend their time. We generally encourage, and we find, that our best managers are the ones who are the closest to the problem spaces.

And that's where managers are spending a lot of their time. A really good example of it here would be that most of our managers who are getting hired at this point end up starting as ICs at the company. They start by writing code, familiarizing themselves with not just our own AI tooling, but also the code base itself. And then building much more intuition around the product areas and engineering areas that they're responsible for.

Some of those managers actually end up even remaining ICs, where it's like, you know, the best way for me to drive impact at the company is by being that close to code. And others obviously end up being people leaders as well. But I would say the biggest

thing that we encourage managers to do here is to stay in that habit of continuing to build. As mentioned, with the inner loop of developing software being as tight as it is, what I find is that most managers have enough automations through Codex, and enough internal tools that they have developed, that actually streamline or create the types of systems that you're referring to. A really good example of it is that we just went through performance reviews.

Even though the human judgment was always there in terms of performance feedback and career trajectories and all those conversations, most managers

Mentorship in the AI Age

leverage AI tooling considerably to go and synthesize a lot of different feedback and to pick up on heuristics that they need to pick up on and all of those things. And much of that tooling was tooling that they almost created ephemerally, you know, on their own dev boxes. So that's an example of it. I mean, another example of it is

managers have enough automation set up. Just over this weekend I set up an automation on Codex where every morning at nine AM or so, Codex just goes and updates all the repos that it needs to have access to. And then, by five PM, if Codex realizes that I haven't really contributed any code to the code base,

Codex would go and literally find some sort of a recently reported issue, and it would generate a fix for it and then push it out for me to review and push into the code base. So I find that AI tooling is just giving managers so much more time in the day, and they're spending that

time just being much closer to the code bases and also creating these systems, or this almost ephemeral tooling, that helps them do more in that day. And then the last thing here is that there is definitely a strong emphasis internally on finding leverage through these tools.
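
Editor's illustration: the morning-sync and end-of-day-fix routine described above can be sketched as a small decision function. This is only a sketch under stated assumptions. It does not call Codex or any real scheduler, and the names `next_action`, `SYNC_TIME`, `CHECK_TIME`, and the action labels are hypothetical, chosen for illustration rather than taken from the conversation.

```python
from datetime import datetime, time
from typing import Optional

# Times taken from the description above ("nine AM or so", "by five PM").
SYNC_TIME = time(9, 0)
CHECK_TIME = time(17, 0)


def next_action(now: datetime,
                synced_today: bool,
                my_commits_today: int,
                fix_drafted_today: bool) -> Optional[str]:
    """Decide what a hypothetical daily automation should do right now.

    Returns a label for the action to take, or None if nothing is due.
    """
    current = now.time()
    # Morning step: bring the local repos up to date once per day.
    if current >= SYNC_TIME and not synced_today:
        return "sync_repos"
    # Evening step: if the manager has not committed anything today,
    # draft a fix for a recently reported issue and leave it for human review.
    if current >= CHECK_TIME and my_commits_today == 0 and not fix_drafted_today:
        return "draft_fix_for_review"
    return None


if __name__ == "__main__":
    print(next_action(datetime(2026, 3, 9, 9, 5), False, 0, False))
    print(next_action(datetime(2026, 3, 9, 17, 30), True, 0, False))
```

Note that the evening step only drafts a fix for review; as in the description above, a human still reviews it before anything lands in the code base.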

So some of the best internal tools that we have are tools that people develop internally for themselves, but then there are different ways for them to share those tools and skills and automations with others as well, to find broader benefit.

A typical part of a manager's responsibility is mentoring and coaching and helping people grow their potential and ability.

How does that apply in the world of AI-native teams? Because a lot of the things you've talked about so far are project innovation and moving fast. But there's still a need for the classic leadership and coaching. So how does that play out?

One thing that I find really fascinating in AI-native teams is the fact that mentorship is very two-way over here. I'll start with mentorship from, let's say, more junior and early career people to more senior people.

What we're finding is that the early career people are a lot more AI native and a lot more AI forward. They're almost, I would say, very unapologetic about use of AI as well. And there are a bunch of different examples that we have of that. So there's quite a bit of mentorship that goes upward, towards the more senior engineers, of like, here's a very interesting take on this problem or how to solve that problem. We've also seen those people really embrace this idea of

almost abundance, this mentality of abundance, that tokens are just unlimited. You could use them on really anything that you want. You could solve as many problems as you want. And that's something where I feel there's quite a bit of mentorship from junior to senior engineers.

But then on the flip side over here, as much as the inner loop is very much tightened for all engineers and especially junior engineers, so much of the engineering job and especially the senior engineering job is about just having really good judgment, really understanding the user problems, which ones to solve and which ones not to solve. How do you find the real data that you need to solve those types of problems?

So there's a huge amount of mentorship that's happening from senior engineers to junior engineers on what to do now that all of your capacity's freed up.

Technical Intuition for Leaders

Now that you're not really worried about the syntax of the code and the nitty-gritty details of what that code looks like, how do you really scale your impact? So I would say mentorship, we found, is going very much in both directions. There's a lot of judgment and user problem solving and those types of mentorship happening from senior to junior engineers.

But then there's also mentorship from junior to senior engineers in terms of different ways of getting more and more impact and value from Codex and other types of AI tooling that we have here.

I was talking to folks in our community about how the best managers are the ones that are going to be closest to the problem space. And so I was wondering if we could dive into maybe some examples. The reason why is because we were talking about the role of judgment and the increased value of judgment as

some of these things evolve. As teams shift to be more AI native, cultivating that type of judgment and training that judgment is going to become more and more important. So I'm looking at this dynamic of being closest to the problem space and cultivating judgment. Can you share maybe an example of an engineering manager

close to the problem space, or how people are getting close to the problem space, or what it looks like to get close to the problem space and exercise that good, effective judgment?

One very recent example that comes to mind is an engineering manager who is working on our ecosystem team. You know, we all know that engineers just

love the patterns that they're used to, and they love that one specific way, especially when they're building platforms for other engineers. That's something that you could geek out over for quite some time. And that manager noticed that there was an internal debate happening within his team on two different ways of solving that one problem. And that debate was

probably taking slightly longer than it should. And that manager literally, you know, sends a Slack message, and the Slack message is very much: we were discussing option A versus option B. Here's this thing that I vibe coded, on the left, which is option A. And then here's the one that is on the right, which is option B. And as you can see, there are these performance characteristics and all that,

you know, in A that are not found in B. And right away, all the debate just ended right there. That is something that I could have probably never imagined would have happened even three or four or five months ago. Three or four or five months ago, that manager would not have had

that much time to stay that close to code. And then for that person to build two different prototypes, like working prototypes of two different options: I would say writing code would have probably been the most expensive way of breaking that debate. But now it's the cheapest way of breaking that debate. And that's what that person did, and just totally broke that debate.

I'm finding more and more of our managers where the barrier to entry for you to write code and to contribute meaningfully is so low that they're just spending tokens on staying as close to code as possible. Another very similar example is that on our Pulse product, which is our proactive product on ChatGPT, there was some pretty interesting gnarly behavior that the team was really trying to troubleshoot, and they were trying to reason about

why that behavior is the way that it is. And one of the managers on the team literally just prompted Codex like, "Hey, here's what we're noticing, and here's this weird bug that our users are running into. What could it be?" And Codex literally took its time, but then came back with, "Okay, your bug could be XYZ." And lo and behold, as the team tried it out, Codex really pinpointed that bug.

So I feel like it's one of those where the barrier to entry, and the barrier to contribute on a technical basis, is just so low right now that many of our managers are now finding themselves, on a day-to-day basis, dabbling with the code in a way that they never could before.

This judgment topic, I think, is so important. There is this fear that more junior engineers will have less opportunity to cultivate that judgment and may defer their cognitive engagement with problems

Cultivating Judgment: Problem Framing

to AI tools and miss out maybe on understanding the underlying systems or understanding the problems. But what you're describing with this junior-senior pairing and mentorship, it's helping export and helping facilitate passing on some of that judgment. And so I think the broad question is, how do you cultivate judgment, or how do you optimize for this exchange of maybe more AI-native workflows and then also engaging with problems,

understanding judgment and how to decide what to solve versus not, and finding the right data, and some of those practices? What does it look like to optimize for that exchange, so that more junior developers gain those skills and more senior developers pass those on?

I feel like the system of growth here probably is not very different than the systems of growth that I've seen in the past as well,

you know, even on past projects, and it is very much the case over here as well: you always want to pair a very senior engineer, let's say, with a few junior engineers. And back in the day, it was very much about the idea that junior engineers are going to be probably churning out a lot of code and the senior engineers are going to be the ones that are going to

guide the effort, and at the same time there'll be enough mentorship where these junior engineers would over time become those senior engineers and then have a multiplicative impact. And I would say similar things are happening now, except that the things that were delegated to, let's say, the junior engineers, like the nitty-gritty of writing code, are very much done by AI at this point.

And all that leftover time, of which there's a lot, is spent on having the deep conversations and these deep mentorship sessions of, you know: okay, so the code was the easy part. What do we do next? What did we learn from it? How do we drive the types of outcomes that we want from it? So that's the first part of it. And the second part of it would be that we do need to spend a lot of effort towards having the right

tooling in place and the right guardrails and systems in place in order for us to move really fast. So the one thing that did not come up in this conversation yet is that

Rapid Fire Questions

As we talk about accelerating that inner loop, and that inner loop is accelerating so fast: the inner loop accelerating really fast without the outer loop accelerating just leads to a huge amount of chaos. A really good example of it for us is that our engineers are probably up to an order of magnitude more productive than they used to be, if you measure it by just

the amount of code that is being written at OpenAI. Not even written: the amount of code that is being submitted at OpenAI, like checked into the different repositories. But when you have that much code being written, your bottleneck starts to become your CI, your CD, your ability to safely ship code. And this is where I would say we're continuing to have to invest in systems

that allow us to take full benefit of these innovations, these productivity gains, that are coming from AI. As an example, there's a huge amount of focus now on how you write the right type of tests to really be able to move as fast as you want.

Or if you have a lot of deploys going out, how does AI or whatever determine the right tests to run for each deploy so that deploys can be done in a timely manner? Or when it comes to user outcomes, how do you make sure that you get the signal on user outcomes as quickly as possible?
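
Editor's illustration: one simple, non-AI way to approach the "right tests for each deploy" question is change-based test selection, where a test runs only if it covers a file touched by the deploy. The sketch below assumes a precomputed map from test name to the source files it exercises. The function `select_tests`, the `COVERAGE_MAP` data, and the file names are all hypothetical, and nothing in the conversation says this is how OpenAI does it.

```python
from typing import Dict, Iterable, List, Set


def select_tests(changed_files: Iterable[str],
                 coverage_map: Dict[str, Set[str]],
                 always_run: Iterable[str] = ()) -> List[str]:
    """Return, in sorted order, the tests whose covered files overlap the change set.

    Tests listed in always_run (for example a smoke test) are included regardless.
    """
    changed = set(changed_files)
    selected = set(always_run)
    for test_name, covered_files in coverage_map.items():
        # A non-empty intersection means the deploy touches code this test exercises.
        if covered_files & changed:
            selected.add(test_name)
    return sorted(selected)


# Hypothetical coverage data, for illustration only.
COVERAGE_MAP: Dict[str, Set[str]] = {
    "test_login": {"auth/session.py", "auth/tokens.py"},
    "test_checkout": {"cart/pricing.py", "cart/checkout.py"},
    "test_search": {"search/index.py"},
}

if __name__ == "__main__":
    # A deploy that only touches auth/tokens.py runs the login test plus the smoke test.
    print(select_tests(["auth/tokens.py"], COVERAGE_MAP, always_run=["test_smoke"]))
```

The trade-off in a scheme like this is that the coverage map has to be kept fresh; a stale map can skip a test that should have run, which is why an always-run set is kept as a safety net.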

So I would say there's the mentorship part, and then there's also the system-level thinking here of how you train engineers to spend more and more of that time on guardrails and on being able to scale the outer loop as well, which I would say has not really yet scaled as fast as the inner loop here.

What do you think are the bottlenecks in the outer loop besides, for example, the CI/CD?

I mean, the one thing that is extremely obvious, and I would say the thing that would be amazing once we get to it, and in different use cases we're starting to get closer and closer to it, is that the outer loop in my mind should be as tied to the user outcomes and user value as possible. So really, the outer loop today could look like: I wrote all of this code, and then there's this bug over here, and then

once that code gets shipped out there, someone goes and validates whether that bug got fixed or not. One version of that outer loop, which is an extremely wide outer loop, is that AI itself validates that that is the outcome that it needed to achieve, and then that is extremely fast. And you could take it a step further here as well.

At the end of the day, we measure the usage of our product in terms of how many people are using our product and how deeply they're engaging with that product. Well, the widest possible outer loop here

could be that as code is being generated, when that code goes to production, the outer loop consists of looking at all those success criteria that you're looking for, and then automatically iterating on that, and then running the next set of experiments for you to iterate on it and to drive the user outcomes that you want.

So I would say that's the ultimate win over here, which is how do you make the outer loop as closely tied as possible to the user value and the thing that you're trying to move. And until then, I would say we have different proxies for it, and we also have a huge amount of human in the loop here as well, which really keeps it going between the inner and the outer loop.

I think there are a lot of folks in our community who have a lot of questions about this, because right now it seems like

everybody's like, "Yeah, code production is increasing. How much we're shipping is increasing." And I think there's this desire to want to connect to the outer loop and understand the systems and what that looks like. And so I think there's a big gap there. We'll probably have to do a part two to dive into all those different systems. Sulman, we've got some rapid fire questions. Yeah. If you're ready to dive in. Okay, perfect.

So first question: what are you reading or listening to right now?

One thing that I continue to listen to, and I've been doing so for probably over a decade, is 99% Invisible. It's a podcast. I think they do one to two episodes a week. It's very fascinating to just understand things at a much deeper level, like everyday things, described at a much deeper level.

I love 99% Invisible. I've only listened to a couple of episodes, but the Invisible Cow Tunnels of Chicago is the story that I have in the back of my head all the time. That's a great recommendation. What is a tool or methodology that's had a big impact on you?

The one that has had one of the biggest impacts on me is this idea of thinking about things in terms of extremes, or barbells. You would see me use that framework in a lot of different places. When it comes to hiring, hire

very, very junior people on one end and then very, very senior people on the other end. When it comes to the types of skill sets, hire extreme generalists, which are people who can just do anything and everything, and then hire extreme domain experts as specialists.

So that philosophy, in my mind, has worked out really, really well, because it really optimizes for variance, and variance just generally leads to much better outcomes than optimizing for the middle here.

What's a trend you're seeing or following that's interesting or hasn't hit the mainstream yet?

I would say it's starting to hit mainstream now with what's going on with

OpenClaw and others. But the trend that I was tracking for some time was that the way people interact with AI will change. And I would say the way that I describe it is that right now the way you interact with AI,

especially on ChatGPT, but for everyone else too, is this idea that you have a conversation, and for every topic you have one conversation. And I firmly believe that this will move to something that resembles more of how it is with a human: you just talk to a human, and you don't really keep in your mind this idea that here's a new conversation that you're having with a human. You just talk to a human and go back and forth.

So the way it manifests itself now is this idea of just a monothread with the model. And that's something that I think is going to really change the way that we think about these chatbots, as more like agents. And it just represents more of a natural interaction with AI as well.

Yeah, it is really interesting to think about how the default form for a lot of the ways that people have been building platforms has been distinct chat windows.

And to think about what that would be like if it was one continuous thread like that. It's kind of funny that that became the default design pattern.

As for it being the default, I want to talk to my product peer over here who was

responsible for that decision. That was very much a technical decision that we had to make, just based on context windows and simplicity of the product and what we could do at that time. And it's one of those where you just get stuck with the default or the starting point, not really realizing that there's a better way.

Last question. Is there a quote or mantra that you live by, or a quote that's been resonating with you right now?

This is probably more of a since-childhood type quote, and I will butcher it because there are different rewordings of it online, but the one that always resonates with me is this idea of: plan your life as if you're going to live forever, and live your life as if you're going to die tomorrow. And I feel like there are a lot of very important gems and lessons that are just

embodied in it, and it's one that I remind myself of every day.

I personally appreciate that as somebody who's in kind of a big life transition, moving back to the Bay Area. That, I think, kind of hits us at a really good time.

Sulman, I just want to say thank you for leading us through the teams and organizations and the shifts that you're seeing there, how roles are changing, some of the workflows. On behalf of all of the CTOs that I was talking with over the past two weeks, I think you've really helped us understand some of the shifts and how people can make better, more effective decisions for their organizations.

If you're listening to this and you're wondering, how can I connect with other engineering leaders in my city? Pull up your phone right now and go to elc.community. Click our chapters page. You can see that on the menu on the left. Find your local chapter and click join. We're hosting

virtual and in-person events all the time, and this is the best way to help you get involved, expand your network in your city, and support your leadership and career growth. So pull up your phone, head to elc.community, join your local chapter, and get involved. A huge thank you to all of our local leaders who make community happen, and thank you for listening to the Engineering Leadership Podcast.

This transcript was generated by Metacast using AI and may contain inaccuracies.