Episode 9 - ChatGPT Atlas and the next era of web browsing

Nov 13, 2025 · 1 hr 14 min · Ep. 9

Summary

The episode explores ChatGPT Atlas, an innovative browser from OpenAI designed to leverage AI for a more intuitive web experience. Ben Goodger and Darin Fisher detail how Atlas integrates ChatGPT to act as an agent, automating tasks, personalizing content, and simplifying complex web interactions. They discuss the architectural decisions behind building on Chromium, the future of agentic internet traffic, and the profound impact AI will have on web accessibility and user productivity.

Episode description

How will the internet feel when your browser can actually help do things for you? OpenAI’s Ben Goodger and Darin Fisher, whose past work shaped some of the most popular modern browsers, dive into the making of ChatGPT Atlas. They explore how AI changes what a browser can be, from tabs you can talk to, to agents that take over tedious tasks. Learn more about the decisions they made along the way and what’s coming next.


- 00:00:45 What is Atlas?

- 00:03:34 The state of browsers and AI on the web

- 00:13:55 Under the hood: why browsers are hard (OWL, rendering)

- 00:22:00 Building with AI: Codex, cross-language, Swift on Windows

- 00:33:39 Search in Atlas: one box plus model response

- 00:41:28 Favorite features: scrolling tabs and tab search

- 00:45:23 Side Chat in action: summarize, shop, build forms

- 00:46:59 Real-world wins with Agent (cloud bill, medical results)

- 00:52:45 Why Chromium? Compatibility and extensions

- 01:07:57 Five-year vision: an agentic web and reduced toil

- 01:13:11 Power tips and closing remarks


Learn more about OWL

https://openai.com/index/building-chatgpt-atlas/

Hosted on Acast. See acast.com/privacy for more information.

Transcript

Introducing ChatGPT Atlas

Hello, I'm Andrew Mayne, and welcome to the OpenAI podcast. There have been a lot of exciting releases from OpenAI recently, including GPT-5.1, Sora, and one of my favorite new applications, ChatGPT Atlas. Today, we're going to be talking to the team behind it, Ben Goodger and Darin Fisher, and explore some of the reasons why OpenAI decided to make a browser, what the future of agentic capabilities means, and where everything's headed next.

Time is right because it's actually how people should be starting their journey. We're moving to a world where you can just tell the computer what you want. So I think it's kind of powerful, this idea that the agent has its own workspace. My view for this has always been that this is like a long-term investment.

Let's begin with what is Atlas and why?

So Atlas is a new kind of browser for an era of the web where people are interacting with new technology in natural language. And so it's the kind of browser where you can just tell it what you want, whether it's to find the next outfit that you're going to buy or to help you solve a really hard problem. And then it can help you harness the web to get a bunch of stuff done. And so central to this idea is that if we take ChatGPT and make it the heart of your browser, not just an add-on, it's something that can actually help you make sense of the content that you're seeing on the web. It's something that can help you take action on the web. It's something that can learn from your browsing to personalize your experience and help you with tasks that aren't just done in a few minutes, but might take days or weeks or months, or just generally help you become a more curious, more effective person. And it can help you come back to a task that maybe you've...

Evolution of Browsers with AI

Why now? I think the progression of technology with these AI models has been really stunning to watch over the past couple of years. It feels like we're at this sweet spot where the capabilities of not just the LLMs that have powered ChatGPT, but also sort of this new area of computer use and some of the other surrounding technology is at a point where we can build some really compelling experiences for people. So we wanted to give it a shot.

You know, like Ben said, the models have gotten so much better and they continue to get better, and you see the slope of innovation there and the pace of improvement. If you look back at the beginning of the year when Operator first came out, for example, it sort of hints at some of the potential. And now you fast forward to where Atlas is with agent, and how much faster it is, how much more capable it is. Just look at that slope and you start to project: what's it going to look like next year, five years out, etc.? And to get that foundation in place, that's what we were excited about. And it felt like the right time.

For me personally, you know, I felt like I had made that transition to seeing how ChatGPT makes so much sense in my life and how much I was using it, and feeling like putting ChatGPT at the core of a browser, not just another tab that you have to go to, but to have it be at the core and part of that flow, time is right because it's actually how people should be starting their journey. And so we want to make that just so natural and easy. And so I'm really excited that we've been able to bring Atlas up and I'm excited to bring it to more platforms.

You two have quite a bit of history working in browsers. Netscape, you've worked on Firefox and Chrome. And I'd like to kind of understand: where do you think we are right now with browsers? We've got like over a 30-year history of these, and it seems like they've been changing a little bit incrementally, and then all of a sudden now we're adding AI to it, et cetera. How do you look at the browser landscape?

Well, I think we have entered this really exciting time on the web where we've added this very human form of interaction in the form of these large language models, where you can just speak to the software and have it do the right thing for you. And so I think that's really going to transform the way people get stuff done online. We've gone from a world where you've had to remember website addresses, you've had to go and search for them, and now you're just going to be able to ask for the tasks that you want to get done and you're going to see it get done.

I've noticed that a lot of people were thinking like, well, we still have browsers. That was a question. Are we going to have browsers? And it seems like browsers are going to be here to stay for a while. Is that something you both feel?

It's a tool that people reach for a lot. You get a lot of things done on the web and using a browser. It's hard to imagine that not being a big part of how people use their computers.

We've been through many phases of the internet and many phases of browser development. There was a time when, here comes mobile, why would anybody use their laptop anymore? Why would anybody use a desktop computer? People continue to use desktop computers. They reach for it for very different things, maybe, now that they can also reach for their phone for certain things. But the web browser continues to be such an important tool on your computer for how you get work done, how you do research tasks, how you look for information and do that kind of work. Substantive work happens within the browser, happens on the web. Don't really see that changing. If anything, I see that growing, because it's just like a conduit to all the world's information. It's such an easy platform for people to bring experiences and make them available to everybody. The browser just makes that so easy.

There's something sort of interesting about it where, of all of the technological advancements that we've had over the past 25, 30 years with the web, there's something very durable about the browser. Even if you look at this most recent wave of generative AI, with ChatGPT launching, it launched on the web. I think that's a very powerful reflection of the capabilities of this platform. And the platform itself is amazing. I don't really need to recount all of the ways it's amazing, but the fact that it's this very inherently open platform, content is published to an open platform, an open internet where anybody can stand up a browser to consume that content. There are really no gatekeepers when it comes to the web, which is a really remarkable aspect to it. So people can freely publish information. People can freely go and find that information. And so it allows it to just blossom and grow and evolve in different ways.

And then it's very natural that you might want to take something like an LLM and point it at that, because now it can, on your behalf, try to understand it and help you navigate it. So the idea that it can do all that means it just makes it so much easier when you're trying to take advantage of all that information that's out there as a user. Here it is. It can go and find it for you or understand it for you or explain it to you.

I just want to say, I think it's been really interesting to see the evolution of the web. We got started in an era where it was coming off of the dot-com boom, right?

The needs of the browser back then were different. The kinds of things people were doing on the internet were different. They were totally exploring. And then over the course of time, as we've worked on browsers, the kinds of things people tried to do in the browser were just so much more.

AI Makes Computing More Accessible

If you go back to, like, the early 2000s, you had the evolution of these more advanced web applications. I remember marveling at Google Maps when it first launched. The fact that you could just scroll and pan through a map so effortlessly. And then it goes on from there, you know, all the different kinds of web apps that people take for granted, things like YouTube or, you know, I'm going to list a lot of Google apps because I worked at Google, but, you know, Gmail, Google Docs, all these things, the kinds of things that...

I mean, I can do all that in my browser. It's kind of amazing. It's become sort of like this operating system for your life on your laptop kind of deal.

Yeah, and it pushes what the browser needs to be able to do, right? And so this era that we developed Chrome in was an era where people were already pushing the boundaries of what you could do inside of a browser. But they had good reason to do it, because the web being this platform where it's so easy to put new experiences onto the internet meant there was a lot of motivation to do that. And, you know, developers were being very creative in how they could push the bounds of what the browser could do. But fast forward to today and it's like you have all that. That's this foundation. And now just think about, like...

The world's gotten more complicated. I think there's a lot of opportunity for it to feel overwhelming, the amount of complexity for people. Even going back a few years, I remember when I was a kid and my school friends and I, we'd trade shareware with each other on floppy disks, as you did back at that point. And my mind is just not meant to retain certain types of information. And so all of the things like the command line for how to run your unzip tool or however that worked, this was just something that felt entirely bizarre to me. And so in that sense, the web was kind of a really refreshing take on that. It was something where I could just go and click on things and explore without having to sort of understand the underlying nature of the machine and source.

What I find with where the future of technology is going is that when we have these AI assistants that are attached to your computer, I think we'll find that we make that computing capability much more accessible to more people who aren't necessarily experts. Maybe not just even in terms of how an operating system works or how a browser works, but even how individual websites work. You can express yourself more naturally as to what your intent is, what your goal is, and then the system can kind of figure out how best to accomplish that for you.

It seems like there were a lot of ideas about what the web could be. And part of it is that a lot of standards were things that were sort of decided after the fact, or we had to sort of go here and do that. And then there were things that would have been nice to have, like the semantic web, where we made sure that everything was sort of annotated, but just in the real world, the corporate world, it's often hard to do that. And now, did you think you'd see a world where, hey, we could have LLMs sort of understand this?

Well, I think it's amazing to see. This is kind of the magic of these modern AI models, is that they are really able to interact with things the way we interact with them. Of course, the world is designed for people with eyeballs, and online, for people with mice to click on things and fingers to tap on things. And so you talk about the semantic web: of course it would be really nice if people would publish websites that were more, you know, inherently understandable by machines, but their motivation is to go where the users are. And as much as we make a push for websites that are more accessible to screen readers and whatnot, the reality, of course, is that that's just not where developers spend their energy first and foremost, right? And it's usually an afterthought to make sure that you make everything extremely accessible. So kind of the beauty of these AI models is that they kind of meet the technology where users are.

You know, technology is designed for people to consume. So you take a look at the way that it interacts with language and understands language. It can interact with language the way we interact with language. Self-driving cars, they can interact with the roads and the systems of transit the way that we interact with them. And I think that's kind of the beauty of these AI models, is that they can be developed for this world that was designed for humans. And so that extends naturally to the browser. I think it's not necessarily that we predicted exactly 100% how all of this would play out, but there's some very powerful ideas in that original internet. There was this idea that it was structured data that would be read and interpreted by a machine and then presented to the user in some way. With the original web, there was this idea of a user's agent, a user agent, and that's a browser: it takes machine-readable content and then applies some presentation preferences to it. Maybe you like your font to be a little bit bigger so it's easier to read. Maybe you like it a certain style or weight or other stuff like that. That was the idea back in the original web.

And so I think that that carries forward in today's era actually very, very well. And you can view where we are today as just sort of the natural endpoint or a continuation of that journey.

Evolution of the user agent, right? Back in the day, even in the Netscape browser, you could write what was called a custom style sheet or a user agent style sheet. You could override the colors of any website, and people maybe who were more sophisticated would know how to do that. Later on, browser extensions kind of made that a more universally available kind of thing. People could write an extension, share it with other people more easily. But it just makes a lot of sense now to empower an LLM to be able to go and, on your behalf, as Ben said, really supercharge that user agent, to be able to do more things on your behalf.

Yeah, there were a lot of interesting ideas, I think, going back at the dawn of all of this. And you look at even the names of some of the tools, like Gopher and Watson and Sherlock and whatnot, and how it was kind of more proactive, the idea that, as you mentioned, these are sort of tools that aren't just document viewers. And I think that we kind of take the browser for granted, and the idea that it just shows me the website, and I can see where it's helpful for you.

Architectural Design of Atlas

You two have probably spent most of your careers working in the space of browsers and trying to understand that. And it seems like there's an inordinate amount of complexity there that's invisible. Could you give me an example of the kinds of things you have to deal with when you're trying to figure out how to make something work?

Man, browsers are maybe surprisingly complex. I don't even know where to begin. I'm amazed at how much work there is and how much technology goes into building a browser. It's basically like an app platform or a mini operating system that's running on your desktop. So in every discipline of computing, it feels like you can nerd out on some aspect of the browser.

I was just having a conversation over lunch with one of the engineers on our team explaining how OWL works, which is our embedding of Chromium that runs out of process. I was explaining the rendering model for this, and it kind of led to a conversation about how Chrome first worked when we first built it and then how the advent of GPU accelerated rendering evolved. And now fast forward to the way it all works today and what we're trying to do with OWL.

And just sort of the depth of complexities there. And I could go on and on, actually.

Well, it is interesting, because I think that people kind of overlook, like, kind of the browser wars led to technologies like running Node on servers to actually do stuff, which is a thing nobody even thought would happen, but because just trying to make these things much more efficient and faster. And I, again, I kind of, you know, marvel that anything works at all.

Looking at some of the technical stuff you have released on ChatGPT Atlas and understanding it's not just a plugin. It's not just a thing that adds a ChatGPT sidebar. Could you explain a little bit more about the architecture?

So from a design perspective, I would say that we wanted to imagine the entire experience with ChatGPT at the heart of this thing. We have a ChatGPT extension as well that you can install in Chrome, for example, but there's some limits to what that can do... to just be really empowered to look across the entire browsing surface. And I think maybe 2% to 5% of what we want to do is there today. But we think that being able to own the whole browser experience in this way gives us the opportunity to weave that ChatGPT magic in throughout. And that includes things, for example, like on any text field, you can invoke ChatGPT and have it help you write. And then as it is helping you write, that is your personalized ChatGPT: it can sign your email as you because it knows you from your other use of ChatGPT. And so being able to build it in a way that enables these very richly integrated use cases felt very important, to have a browser as opposed to just an add-on for an existing app.

And it gets to some of the foundational stuff. When ChatGPT is at the core of this thing and you enable things like the memories feature, it means that you can ask this thing, hey, what was that thing I was looking at again? And it's going to know and it's going to help you. Like, who hasn't had this sort of experience of, oh, yeah, I remember I saw some video or I saw something. What was that again? How do I get back to it? I want to share it with somebody else. Traditionally, you might go through your browser history or your YouTube history and scroll through there trying to find it. Or you're like, what was that tweet again that I saw?

I was looking at a recipe with my son on the weekend. What was the third buffalo wings recipe that we found that I wanted to make? How do I find it again? And just to be able to easily recall it, because this thing is able to do that so effortlessly. It's a side effect of it just being there. Probably what I would say is the biggest advantage for doing this, and especially in the way that we've done it, is how we thought about integrating this agent.

And this is really where having our own concept of browser, including what it means to have a collection of tabs, comes in. If you think of your regular browser window, that's a collection of tabs. You can also imagine that your agent has a collection of tabs. Maybe each instance of the agent that you've asked to go off and do something for you, you might have like five of them, each running on different problems, and each one of those has its own collection of tabs. And of course they are not showing up on your top tabs because you didn't open them, but it nonetheless has them and it's working through them and it's getting information from them and it's processing it and taking action on your behalf and clicking on things and all that sort of stuff. That is the sort of functionality that you can have when you go and you design a system like this, sort of end-to-end. So you can invent all these abstractions.

Agent Mode Capabilities and Control

And you had this in the very, very first version. I joined Ben to work on this, and he had already had sort of this idea, this idea that we would be able to segment tabs between the tabs that are the user's tabs and the tabs that the agent is working on on your behalf. And that kind of shows up in the product today. When you start an agent task, it goes off and is going to work on whatever you asked it to work on, and it might need to open some additional tabs. And instead of those tabs just appearing in your tab strip and perhaps feeling a little discombobulating, because what are all these tabs?

It has just sort of accumulated some work in the background. And when it's done, then it presents it to you in a tabular form. You can go and click through and see what it actually did. Or you're just happy with the outcome and you really didn't need to see all the intermediate steps. And so I think it's kind of powerful, this idea that the agent has its own workspace.
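The tab segmentation described above can be pictured with a small sketch. This is only an illustration of the idea, not how Atlas is built: the names Tab, Workspace, Browser, start_agent_task, tab_strip, and review are all hypothetical, and the example URLs are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Tab:
    url: str


@dataclass
class Workspace:
    """A collection of tabs owned by the user or by one agent task (hypothetical model)."""
    owner: str
    tabs: list[Tab] = field(default_factory=list)

    def open_tab(self, url: str) -> Tab:
        tab = Tab(url)
        self.tabs.append(tab)
        return tab


@dataclass
class Browser:
    user: Workspace = field(default_factory=lambda: Workspace("user"))
    agent_tasks: dict[str, Workspace] = field(default_factory=dict)

    def start_agent_task(self, task_id: str) -> Workspace:
        # Each agent task gets its own workspace, separate from the user's tabs.
        workspace = Workspace(owner=f"agent:{task_id}")
        self.agent_tasks[task_id] = workspace
        return workspace

    def tab_strip(self) -> list[str]:
        # Only tabs the user opened appear in the visible tab strip.
        return [tab.url for tab in self.user.tabs]

    def review(self, task_id: str) -> list[str]:
        # When a task is done, the user can look through what the agent opened.
        return [tab.url for tab in self.agent_tasks[task_id].tabs]


browser = Browser()
browser.user.open_tab("https://example.com/inbox")
task = browser.start_agent_task("compare-prices")
task.open_tab("https://example.com/store-a")
task.open_tab("https://example.com/store-b")
```

In this sketch the agent's two tabs never show up in tab_strip(), but they remain available through review("compare-prices") for anyone who wants to see the intermediate steps.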

I would say for many people using this on the surface, it seems pretty easy to understand. I've got a browser, I've got ChatGPT, but we also have agent mode. What is an agent task? What is not? What would be a ChatGPT task? Could you explain that? Agent mode is basically you're inviting ChatGPT to take action on the web on your behalf.

And so maybe you are looking at a website and you would like to do something on that site, but you're not quite sure how to do it. What you can do is you can just ask ChatGPT directly. Let's say you're on a spreadsheet and you want to synthesize a pie chart and you don't know how to do that.

You can just say, hey, make a pie chart with this data. And it will go off and it will figure out how to use that software. Now, if you could think about sometimes some of the software that you use, it can be pretty complicated. So just being able to ask in natural language, in your own words, what you want to see.

And then you could just sit back and watch it take over and it starts moving the mouse around and doing stuff like that for you. It's pretty amazing. And you could see how it's going to do it. So actually, you can learn how to now make that pie chart because it's going to show you, which is pretty cool.

I found it pretty useful. I like to study memory methods and stuff, and I have to do a thing where you have to have a system for decks of cards and stuff, and I didn't want to have to paste in a bunch of card emojis. I'm like, can you just go do this for me? It's like magic. It's like witchcraft.

One of the fun things I've seen people do around the office is they'll have written a doc, and then they'll ask it to take a review pass on their doc and add some comments. And so it'll actually go into your doc and it can use the integrated commenting tool of whatever document editing system you're using. And it will just add comments as if it was a collaborator. Yeah, it's really amazing.

Of course, being software engineers, we experienced the model critiquing our code. And over the past year, just amazing to see how much better it's gotten at that. Like in the beginning, it was not always the case that it would find things that were useful. But these days I'm like, this thing is sharing important, like amazing nuggets that are saving me from, you know, shipping bugs. And I can see the analogy to, like, reviewing any document that I might have and just asking it to go and give me some style feedback or, you know, maybe some grammar suggestions or, like, tone suggestions. And it's very exciting to think about how, with any tool you're using on the internet, you could invoke this agent and ask it to interact with that in the same way that I would or somebody else would and just see what it would do. And maybe you learn from it or whatnot.

One of the things that's come up a lot during my conversations with the teams inside OpenAI is how much they're using the tools. They're using GPT-5, GPT-5-Codex to do that.

How has that affected you, and do you think that's going to create an accelerated product cycle?

Absolutely. A couple of anecdotes on this one. One of the top Codex users at the company is on our team. Raw productivity in terms of PR output is off the charts as a result of using this tool. It's really exciting to see what...

Good PR output.

...tools, because they can both sort of help you explore an area, decide if something is worth doing; you apply your judgment, you tell it what you want it to do, and it goes off and does it. And then for folks like me, who maybe spend less time coding than I used to, I can also have it go off and prove some stuff out. So everyone on our team is able to contribute. Our product managers are producing PRs, our designers are producing PRs because of these tools.

And so I'm like a true believer. Before GA, I did this week-long refactoring to try to unlock a certain feature that we were trying to ship. And then I had one more to do. And this time I asked Codex to do it, and the whole project was done within an hour, and it was of similar scale. And I was able to tell it, hey, just do this other one kind of like I did that one. So I had shown it the way and I just asked it to do this other task that was very similar. And it was almost one shot.

One of the promises of really capable code tools that are able to write in multiple languages is porting code from one language to another.

We saw with Sora and how, you know, they said, hey, Android's coming. Bringing it to Android, yep. Yeah, and people were like, oh, great, when's all that? I'm like, oh, no, it shipped. No, no, I mean, actually, ChatGPT's been amazing at doing cross-language translation for quite a while. I mean, coding languages.

So, you know, we're bringing this product to Windows now, and we're actually going to be using Swift, because the team is full of Swift experts, and we are excited to have a shared common code base.

You use Swift on Windows?

Yeah, Swift on Windows. And in my past experience, I was marveling even years ago at just how good ChatGPT was at being able to essentially generate code for me in Swift that did not exist on the internet. So it was taking code that maybe was written for .NET and could translate it to Swift for me. And I was marveling at its capability, or to generate a bunch of obscure WinRT code that's normally very tedious C code with lots of GUIDs and all kinds of things that are very... But it was all just spitting out this code and saving us just an enormous amount of time.

I had Codex CLI spin up a Swift app without ever having to go into Xcode to paste anything in, and it worked just right out of the box. It really impressed me. It's kind of like one of the strengths of these models. So the fact that if you can ask the right question, get the right prompt, and if it's on the right path and how to build something, it can do it in any language. That's really not a problem.

Trust, Privacy, and Personalization

So I've been using this and I've been switching it to agent mode to do stuff. And I know sometimes I can just leave a tab and go off and do something else. And sometimes it's like, hey, if you leave this tab, I'm going to pause. What's going on there?

Sometimes you'll have asked the agent to do something that's very sensitive. An example is it's looking at your email and we would like you to keep your eyes on the road, so to speak. You know, I have a car that has sort of an auto drive feature and it wants me to pay attention to the road. It's helping me drive, but it's not going to let me, like, check my phone or, you know, take a nap or something like that. And so it has a little camera that's watching my eyes, making sure that I'm paying attention. So you can kind of view this sensitive mode in agent mode as kind of like that. It wants me to pay attention to this tab while it completes, just so that I feel like I have a good level of control over it. And in fact, if you look at the bottom of the tab, you'll see that there's a little bar that has a big red stop button in it. And if you've ever been in a machine shop, you'll know that the machines there have these big red buttons on them. If suddenly it starts to do something that you don't want it to do, you just whack that button and it stops. And so that's the idea. Just keep an eye on it, watch it go, and you can always take over if you want to do it yourself.

Yeah, we put a lot of thought into making sure that these features help you feel in control of the experience and take away some of that maybe uncertainty when you want to go use them. For example, on top of what Ben mentioned, there's also the signed-out method of using agent. So if you want to start without it being in an authenticated session, meaning it doesn't have the cookies required to even access your email, you could do that. And I think that can be a great way to kind of try some things and do it where you're learning. Like, how does this thing actually work? And you might then hit a point where in order to take the next step with agent, well, actually, it would be helpful if it were authenticated. And so then you might try doing a task where it does benefit from having your cookies.

You might not remember, probably the first time you actually run Agent, it shows the screen. It explains how all of this stuff works for you. So if you read through that, you'll see sort of the choices that you have and you'll learn about how to use them.

I have about a half a million unread emails. I'm ready to go full auto. It can't do a worse job than I have so far. You might find it just hits select all, archive. Yeah, it wouldn't be the worst. Declare bankruptcy. Yeah, I could have somebody to blame. But it's been super helpful for me, because there have been times when I'm trying to surface an email and the keywords just don't work or the results are too many. And just being able to go in and say, find this thing about that.

It's saved me numerous times. And so one of the other things that we did with Agent as well, because there are times when you'll want to keep an eye on it, is that we, and I'm pretty proud of what the team was able to come up with here, is to make it very visually compelling. And so there's, you know, all of the little sparkles and pixie dust and so on that appear around it as it's working. It's pretty cool. So yeah, love to see more people try it out.

Yeah, it's fun to watch that. And also with ChatGPT too, and agent mode, I've said that I could probably watch a live stream of just watching these systems solve problems, because it's like How It's Made, but watching computers do it. It's exciting to see where this is headed. I also sort of wonder, what's it going to be like when there's going to be a lot of different AI-powered browsers out there? And also, we have to think about, what is the ecosystem like when...

Most of my tabs are opened by my agent and not me. And somebody's trying to capture my attention. The way agent works right now is that it's only like running in response to your... If the agent is doing something, it's because at some level you asked for it. Pages that it opens actually have some limitations. You might be used to browsing around the web and you'll see some page show a pop-up window saying, give me notifications permission so that I can spam you with updates.

The agent tabs can't do that. They're actually blocked. So there's a bunch of stuff like that in the design of the system to avoid you accidentally ending up there. Now, of course, you're free when you browse to a website and you're asked for a notification permission. If you want to receive updates from that site, you're free as a user to go and say, yes, I want this, but the agent won't ever do that on your behalf.
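As a rough illustration of the rule described here, the sketch below models a permission check in which tabs opened by the agent never surface a notification prompt, while tabs opened by the user still do. It is not Atlas code: the `Tab`, `Opener`, and `decide_notification_request` names are hypothetical, and the real browser's permission model is not described in the episode.

```python
from dataclasses import dataclass
from enum import Enum


class Opener(Enum):
    """Who opened the tab (hypothetical distinction for this sketch)."""
    USER = "user"
    AGENT = "agent"


class Decision(Enum):
    PROMPT_USER = "prompt_user"  # show the usual permission prompt
    DENY = "deny"                # block the request silently


@dataclass(frozen=True)
class Tab:
    url: str
    opener: Opener


def decide_notification_request(tab: Tab) -> Decision:
    """Assumed policy: agent-opened tabs can never request notifications.

    Only a tab the user opened themselves can lead to a prompt, so the
    agent is never in a position to grant the permission on the user's behalf.
    """
    if tab.opener is Opener.AGENT:
        return Decision.DENY
    return Decision.PROMPT_USER
```

The design choice worth noticing is that the decision depends on who opened the tab rather than on the site, so the same page behaves differently depending on whether the user or the agent navigated to it.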

It's an interesting world where we think about part of the beauty of the early web was the serendipity. Oh, I found this other thing. I found these other links. But I think then that kind of got sort of weaponized against the user, where basically you try to do a thing and it's hard to do a thing.

It's really interesting. I don't know if this is where you're going with that, but a lot of websites want to just keep you on their website. Maybe they'll run ads which would take you off, but otherwise they kind of keep you in that lane.

The thing about Side Chat, or the model being present there, and even the agent, is that you can ask questions about that site where the answer might be something on a different site. And so it makes the web bigger for you. It helps you not just be stuck down that rabbit hole that you were on, but to help bridge you to something more useful to you maybe, or more helpful. I know you have some good stories on this. There's something just wonderful about the...

I call it beautiful chaos of the web where you kind of don't want to always be stuck in the same place. You want to be able to embrace the diversity of the web and all of the content that's on it. So yeah, I love that. I love that you can do it as well without having to leave the site.

You know, you can do it right there on the side and then you can choose to go somewhere else. But there's this aspect of, I know with Wikipedia, you can go on these multi-hour journeys through content. That's really only a feature of Wikipedia.

Whereas I feel like the Ask ChatGPT sidebar gives you that ability for the web at scale. And so it gives you the ability to ask questions about random sites and then go off in different directions and that sort of thing. I mean, this is extremely useful if you're looking for certain kinds of products and you find yourself on one product page for one company or one vendor.

But now you can be like, well, what else is out there? And the model can say, well, here, there's some other sites to go check out that are related to this. And off you go. Now your world has gotten bigger, right?

I've had some wonderful discoveries with both videos and books that I couldn't find through the YouTube search engine or the Amazon search engine. I found places where, because ChatGPT understood a bit more about what I was looking for, what I was really trying to find...

Atlas Redefines Web Search

And it gave me more utility out of those sites. That's actually another thing: for ChatGPT as a whole, the personalization features, the fact that it learns more about you the more you use it, have been super popular. With Atlas, this extends to your browsing activity, your web history.

And so this allows the browser to create these browser memories, which, as Darin pointed out before, are something you can use to help you get back to a site if you can't remember it later. But it also helps in situations like with the agent. I am a United MileagePlus member, and so I tend to like to look for flights on that site. It would be very tedious if, every time I asked the agent to go and do something like that, I had to tell it to always use United Airlines.

But it kind of knows from my browsing, my browser memories, that I'm a frequent user of United. And so it will just go there. Yeah, it helps you in the forward queries, right? Because now this search experience has so much more context about what matters to you. It just ends up being a lot more efficient, right? It saves you a bunch of time because you don't have to tell it as much again and again.

So I feel like that's something that helps me a lot. I think some people probably have different preferences around these things as well. So there are controls where people can go and see and control what memories are used. You can turn it off entirely.

Also, when you use it, and I don't go into these tabs as much, though I'm starting to do that more, I realized that I can go to images, I can go to news, and I have kind of a search engine. And that's the thing I'm trying to understand: it's a browser, but it's also OpenAI heading towards its own search engine.

Well, part of that comes from the fact that when you're building a browser, people come to that browser with existing intents, like navigational intents or the idea that they do want to look for images, right? Or they want to see a certain kind of subset of information. And so we brought those controls into the landing page of ChatGPT.com so that it would be both familiar to people, but also useful in the way that they're used to, right?

We didn't want people to have to learn so many new things in order to be successful using this product. We wanted them to have a good dose of familiar tools and familiar sorts of things. And anyway, these are just useful. For many people, search and browser are just very connected, almost one and the same.

You know, it's very important to internalize that as we're building this experience. I think it's very powerful, and I was touching on this before, that as people search and use the browser in maybe a very normal way, they're also learning that there's a model there that's going to respond to them. So you get a set of chips across the top, which are like quick links to go to where you were trying to go, perhaps.

There are these different tabs you can click on to see familiar kinds of subsets of information, but there's also this model response coming in. And so, using a product in a normal way, you start to learn that there's another way, that there's a superpower that this thing's providing. And some of it is just the normal ChatGPT experience that people are used to. But not everybody's using ChatGPT to its fullest.

And so when it's core and central to the experience, we have an opportunity to present that to people as part of their normal journey. And I think that's really cool. Similar to Side Chat. Of course, you have to activate it, but it's right there. Ask ChatGPT, and you might be curious. And now you kind of unlock the superpower. But it's right there. The context is there. An interesting experience for me was the very first day when I started using it.

I look at this and I'm trying to understand, okay, this is basically an app that has a browser and ChatGPT. It's not like we just sort of glued those things together. They're both there and there's a deep connection to ChatGPT. And I asked it, could you add a bookmark for Amazon? And then a moment later, the bookmark appeared. And that was a really special moment, to sort of understand what happens when the LLM deeply understands the system.

I think we're very excited about this. I think from a conceptual transformation point of view, we're moving to a world where you can just tell the computer what you want in whatever way you want to tell it. Making computing more accessible to more people is just really profound. The company's mission is to make AGI beneficial to all of humanity. We take that really seriously.

Being able to transform computing in ways like this that might seem very small on the face of it, they add up to something far more profound. And so, yeah, we're excited about that kind of thing. Some of my first experiences with ChatGPT as a user were really this idea Ben's talking about. You know, I was comfortable, happy just sitting there doing my Google searches, but sometimes I didn't quite know what Google query to type in.

And when I realized I could ask a really poorly formed question to ChatGPT and it would make some sense of what I asked, it would give me something that maybe now I could query Google for. That's how I first started using ChatGPT. And then I started to realize over time that, oh, why am I not just asking it in the first place?

You know, I think for people, there's a bit of a journey with new technology, right? We're all creatures of habit. We're used to the way we work. And it works well for us, the things we're used to, so it's not a problem. But as you maybe explore something new, you start to see, oh, there actually is a better way.

And for everybody, that journey is a little different. And so for me, one of the things I was most excited about with Atlas was this idea that when you're typing into the address bar, the default is ChatGPT, because for me, that actually makes sense for most of what I'm going to do.

Innovative Tab Management: Scrolling Tabs

This is one of these things where I feel like now when I don't have access to that, I feel like there's this little bit of friction. It takes longer now because I've got to go find my ChatGPT tab in another browser and figure out how to get at that and do that. Whereas with Atlas, you can just open a new tab and start typing. The old way was a much more manual way. I don't have to be as clever about what I ask.

I have a problem, and I can say it in a much simpler way. I know that I'm still having trouble kind of context switching and understanding that it's not just a URL search bar or just an empty keyword search, whatever, that literally I can ask it for things.

Not just have, you know, "what is the capital of Nepal" pop into a Google search box. Right. And that's the thing now. Oh, yeah. When I go type in the thing, if I type in the URL, I get the URL, but I can also type in my query and do that. But that's still taking me time to adjust.

Well, just as a general rule, I find sometimes modes can be a reflection of some of the limitations of the system underneath that, at the end of the day, humans don't understand. And so I think the north star for us with so much of this stuff is, can we just help you arrive at the right place, regardless of you needing to know, I should put it in this mode or I should put it in that mode?

That is sort of the struggle. It comes down to how do you want to use this tool? And so we want to make this thing something where, as Darin was saying before, you can just go in and kind of tell it what you want, maybe this half-formed thought, and it will give you something good. It will help you figure out the problem.

And of course, if you are a user that understands some of the underlying capabilities of the system, we want to give you the option to invoke those, too, to bring them down and help you. And that's sort of an efficiency gain that you can get. But certainly the system shouldn't require that you know all of those sort of incantations.

It should be able to just take what you say to it and give you something good. Yeah, I think it reminds me back to the early days of browsers in the era when people would install a toolbar for their search engine.

You know, that meant they had yet another box on their browser, right? And Firefox had a dedicated search box for doing your web search. Back then, people were very used to that and very comfortable with it: you had one box to type URLs and one box to type search queries. When we were working on Chrome, we were like, why have two boxes? Why do people have to stop and think about which box to type it into? Just give them one box.

Now, if you look at Chrome, that's what its URL bar looks like, right? Just one box, and that's become the industry standard. But even on Chrome's new tab page, there are actually two boxes. There's the address bar at the top, and then there's this box in the middle that's the maybe comfortable, familiar Google box, right? What we wanted to do, and we kind of pushed ourselves on this, is this whole topic of, hey, you might have a conversation you want to start with the model, you might be interested in a navigational query, but really you might not make up your mind about what your intent is until you start typing.

And just one box is a lot simpler. And so when you open up Atlas, you just have one box on the new tab page. And that was something from a design perspective that we really tried to achieve. And I think we were able to, and it keeps the whole system a little bit simpler for people.
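To make the one-box idea concrete, here is a minimal sketch of how a single input might be routed to either navigation or a model query. This is an illustrative heuristic only, not how Atlas decides: the `classify_input` function, its return labels, and the hostname pattern are all assumptions, and a real browser would weigh many more signals (history, autocomplete, the model itself).

```python
import re
from urllib.parse import urlsplit

# Assumed heuristic: "localhost" or a dotted hostname with a 2+ letter suffix,
# optionally followed by a port and a path.
_HOST_RE = re.compile(
    r"^(localhost|([a-z0-9-]+\.)+[a-z]{2,})(:\d+)?(/\S*)?$",
    re.IGNORECASE,
)


def classify_input(text: str) -> str:
    """Return "navigate", "ask", or "empty" for text typed into the one box."""
    text = text.strip()
    if not text:
        return "empty"
    # Anything containing whitespace is treated as a question for the model.
    if any(ch.isspace() for ch in text):
        return "ask"
    parts = urlsplit(text)
    if parts.scheme in ("http", "https") and parts.netloc:
        return "navigate"
    if _HOST_RE.match(text):
        return "navigate"
    return "ask"
```

With this sketch, `classify_input("what is the capital of Nepal")` is routed to the model, while `classify_input("openai.com")` is treated as navigation. A heuristic like this will misjudge some inputs (a filename with a dot looks like a hostname, for example), which is one reason the user never has to pick a mode up front but can still correct the result.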

It might be a little unusual and not what people are used to, but I think over time they'll get to like it. What were some of your favorite features, some of the things you're glad you were able to implement? Oh, man. It's interesting, whenever you get a chance to build a new browser, having worked on quite a few, you get a chance to sort of start over and reset on certain things.

Not everything, because I think one of the core tensions is that people are used to their browser the way it is. But you do have a chance to rethink some things. So one of the features I worked on was the scrolling tabs feature. It came from an insight that, you know, tab life could be a little better if new tabs were all inserted on the left, or just on one side.

If you're a user who pins tabs in your browser, which is a pretty advanced use case, maybe a lot of people don't know that you can pin tabs to the tab strip in Chrome or Safari or other browsers, but it is a common thing. And if you pin a tab, it'll be pinned on the left side. Suppose that was a Gmail tab: when you click links, those new web pages would open just adjacent to that pinned tab. If you press the plus button, the new tabs would appear off to the right.

And what ends up happening is, as you're working throughout your day, you're going to Gmail, opening a URL from there, you're hitting the plus button, opening tabs on the right, and you're sort of accumulating old tabs in the middle.

And so it becomes a little bit painful to close all of the tabs to the right from the middle, you know, to clean up those tabs, and you just end up with a lot of clutter. So scrolling tabs was one of the innovations that we worked on to try to make tab management better. It's not an AI feature, but when you have this opportunity to rethink browsers, it's an opportunity to rethink some of these primitives and try some different things.

It's a major productivity tool, so finding these wins can be really exciting. One of the things that I came to realize and appreciate only later is that if you have a browser that more naturally scales to having tons and tons of tabs, it means that certain kinds of things get unlocked for you.

Many people are probably familiar with the ability to search for a specific tab that you might have open. There's a Command-Shift-A shortcut or a button for that in many browsers. With scrolling tabs, the fact that it can allow for a lot of tabs to be accumulated without them all being in your face means you can still search across them and find these old tabs.

So in a way, it's like this history of things you've done in your browser is there for you to search in a very familiar way, because it's your Command-Shift-A, it's right there, and you can have that capability without it being cluttered.

I've got like well over a thousand tabs open and I just wouldn't think for that to be possible. Or you might think that that would be a problem, right? But it's not. No. Because the system manages the memory for you. Yeah. Now, the scrolling tabs feature is not on by default. And part of the reason why it's not on by default, as much as we think it's magical, and I'm a huge fan of this thing we built,

is that it is also a little different than what people are used to, and we wanted people to not have to learn so many new things all at once when they're approaching this browser that is, you know, bringing all these AI capabilities. But one of the amazing things, when you have allowed...

The Ask ChatGPT Sidebar: Daily Use

For thousands of tabs to be open means not only do you get to access them again with tab search, but the model can see them. The model can see these tabs. It means your working set can be very large, larger than what you might keep in your head naturally. But you know that there was something there, so you're going to ask the model for it, and it can go and interact with those tabs again. And I think that's actually pretty amazing. I would be remiss if I didn't mention for this question as well the basic feature of Atlas, which is the Ask ChatGPT sidebar.

This is something I get some value out of every single day as I use the browser. I pull that thing open. I ask it to summarize a page if it's too long, or I want to figure out, if I'm reading an article, how it really matters to me in particular.

If I have a question about something that's going on in the world, it can go off and do some research for me and come back with stats and facts and figures. I've used it when I'm online shopping to make sure I'm looking at what really is the best deal on something. I've used it to help spin up agent tasks to go off and automate some of my productivity workflows.

I've had it build Google Forms for me to help me quiz my coworkers on the best way to design new features for the browser. You asked Side Chat to help come up with the outline of the survey. And then you said, hey, can you just put it into a Google Form for me? And it did it. We talked before about...

bringing the power of ChatGPT with you everywhere you go on the web. And I think that sidebar really, it's like having ChatGPT sitting on your shoulder, just right there to help give you some advice wherever you might need it. And sometimes even just simple things, like I was in Slack and somebody shared some text that was in another language. So I just selected it.

And I right-clicked and asked Side Chat about it, and it translated it for me. And it was so much easier; you don't have to sit there and copy-paste, right? My favorite use of that so far was in agent mode. I won't name the cloud provider, but it's a very big company, and often you find out you run a lot of services and you forget what those services are, and at the end of the month you get a bill.

And it's a very confusing bill because you're trying to figure out, I thought I shut this down. Isn't this all of them? Yeah. Well, some of them have maybe been around longer, and trying to parse through that is like reading a Soviet tractor manual. And I went in.

And I said, hey, I got this bill. I think I should have been going. I don't know what's going on. Can you help me with it? And I watched it navigate through the website, go to the page, find the different things I was doing, explain to me what the service was doing. I'm like, can I shut this down? I'm like, yeah, shut it down.

And that was like a hundred-dollar-a-month bill that was just saved. Wow. That's awesome. Another one I had as well: I had some medical tests done recently. And sometimes it can take a while for the doctor to come back and explain to you what they mean. And in the meantime, you have the patient portal there, and you can access the doctor language stuff there. And I can't read that. It's not written in English, normal English.

And you can ask and it will tell you what that means for you. And I found that to be really helpful. Yeah. So saving you money, helping you get some answers, this thing feels very... I'm pretty convinced that this is increasingly the way that people will interact with information. Or you're using some very popular yet complicated HR tool or something like this.

And you're like, where's that thing again? And it of course has studied the manual for you and can go and show you the way. I feel like it took me a long time to realize that once I had, you know, an iPhone, I always had a camera and a flashlight in my pocket. And there are many situations where I'm like, ah, what was the name of this thing I saw in the store?

It's like, you could have taken a photo, you know? Or, man, it's dark. It's like, you've got a flashlight. And I feel with these tools, there's a lot of capability there. We even saw that with search. You know, some of us were power users of search, and for other people it was a complete mystery.

Do you think we'll see a faster acceleration here that people are going to start sharing and understand how to use this? Yeah, I think the stage that we're in with Atlas right now is we think this is a really powerful tool, but we don't know all the ways in which people will use it.

Internal Productivity with AI Tools

...internet in that sense. And so one of the reasons why we wanted to get this out when we did is we just want to see how people use it and hear from people, you know, where it works well, where it sucks and needs to get improved. But I think over the course of time we'll get a better feel.

I also think we'll need to help explain in more cases when the right time to use it could be. Because I think there's a part of building something that feels like magic in making that magic real for more people in more situations. And we don't want to have to rely on people to always think, I should ask this question at this time. Yeah, it's really easy to just, we're creatures of habit. We use the browser the way we use the browser. We use our computers the way we use them.

We don't always realize when there's a better way to do something or a more efficient thing that we could be doing. I feel this way about the process of learning how to use ChatGPT in the first place. It's just like a realization. Oh, I should just ask the model for that. It'll save me time, but it takes a little while and there's a bit of a tipping point for people,

where at some point in their journey, they're going to learn how to use these tools. And there might be some people who are early adopters and they can show the way and figure things out and share those ideas. But also, I think a lot of people haven't yet found their way to how to use these tools in the best way.

I find myself still trying to shut down tabs because I started using browsers in the era of getting the pop-up message, you have too many browser tabs open. The compute and the capability and the management internally of these things is way advanced. And so I think about like, you know, I'm not optimized in many ways.

Also, another, say, non-AI feature of our browser was that we kind of took a page out of the playbook of mobile browsers, recognizing that, you know, your laptop's computing resources are not really unlimited. You have a battery you care about. So we put a limit on how many tabs would actually be backed by a live web page.

Rather than what might be the more traditional approach that desktop browsers would take, which is to just try to mitigate the cost of those background web pages that you haven't used in forever, we will just close them down. And if you go back to that tab, it'll get reopened. We keep a reasonable limit there, and we apply, you know, somewhat of a clever-ish caching algorithm to try to be smart about making sure that the tabs you care about are kept in memory.
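The episode does not say which caching algorithm Atlas uses, only that there is a limit on live tabs and that discarded tabs reload when revisited. As a sketch of the general idea, the block below uses plain least-recently-used (LRU) eviction, which is a stand-in rather than the "clever-ish" policy mentioned; the `LiveTabCache` class and its method names are hypothetical.

```python
from __future__ import annotations

from collections import OrderedDict


class LiveTabCache:
    """Keeps at most `max_live` tabs backed by a loaded page (plain LRU).

    Discarded tabs keep their URL so they can be reloaded on demand.
    """

    def __init__(self, max_live: int) -> None:
        if max_live < 1:
            raise ValueError("max_live must be at least 1")
        self.max_live = max_live
        self._live: OrderedDict[str, str] = OrderedDict()  # tab_id -> url
        self.discarded: dict[str, str] = {}                # tab_id -> url

    def activate(self, tab_id: str, url: str) -> list[str]:
        """Mark a tab as just used; return ids of tabs discarded to make room."""
        self.discarded.pop(tab_id, None)   # revisiting a discarded tab reloads it
        self._live[tab_id] = url
        self._live.move_to_end(tab_id)     # most recently used goes last
        evicted: list[str] = []
        while len(self._live) > self.max_live:
            old_id, old_url = self._live.popitem(last=False)  # least recent first
            self.discarded[old_id] = old_url
            evicted.append(old_id)
        return evicted

    def is_live(self, tab_id: str) -> bool:
        return tab_id in self._live
```

A tab that has been discarded still appears in the tab strip in this model; only its loaded page is gone. That is what lets the number of open tabs grow far beyond the number of live pages.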

So as to sort of lessen the burden on your computer. You might notice also with Atlas that it restarts super fast, because of the way it's structured: the OWL process is separate from the Atlas process, OWL being our embedding of Chromium. So the two can start up in parallel. We can restart Atlas very quickly with all your tabs and the data associated with them,

but the web pages aren't loaded yet. And when you click on them, it'll bring those web pages back. But this way, the whole system can stay fast and lightweight. And we were able to build Atlas as a separate application from OWL. Atlas is controlling OWL, and OWL is projecting the rendering of web pages into Atlas, but Atlas itself can stay a relatively thin Swift application.

Design Simplicity: One Input Box

Why Chromium? That's a fantastic question. I answered this question on the site formerly known as Twitter by saying, you know, web compatibility. So it turns out, for better or worse, a lot of websites are only really designed to work with Chromium. There are features of major websites, which I won't get into naming names, that are just not present if you're not using a Chromium-based browser. And the other reason is Chromium extensions.

Extensions built on top of Chromium are very popular, and when you build your browser on top of Chromium, it means those extensions will just work. And so we want to make sure that we're building a browser that, first off, works for people. All the websites they care about will be supported and all the features of those websites will be supported. And we want to make sure that they could install any of the extensions that they care about and that they're used to using in their browser.

Kind of non-technical, people hear Chrome, they hear Chromium, but I understand there's a really deep lineage that goes even further back. You also have WebKit and KDE and whatnot. So what I'd say is, I think there's a lot of excitement from among the community to see new rendering engines come about. And that's certainly been part of the DNA of the web too over the years.

At the same time, just like Darin said, when you build a new browser and you don't have that many people using your product just yet, you kind of just want the web to work as people know it today. And actually back when we were starting working on Chrome, we had the same concern. Chromium today has Blink, which is its own rendering engine that diverged from that lineage. But at that point, there wasn't much appetite for taking risks like that.

The Chromium rendering engine is based on WebKit, which is the Safari rendering engine, which is open source. And that itself was based on an earlier rendering engine from the Linux world called KHTML. And so yeah, it's really interesting to go back in time. You can see how these sort of open source projects fork and branch and so on. There's code in Chromium that comes from the Mozilla project too. Going back to the 1990s, you can find this lineage.

So browsers are this layer cake of technology that's been built up over time. And really where we are wanting to innovate is that next layer, where the AI model comes in and how it's articulating and interacting with the foundational layers. And so building on top of Chromium gives us this well-known foundation. We built on top of Chromium in a very different way than browsers normally would. Most browsers are just taking Chromium and forking the UI,

layering another UI on top of Chromium, but running right in the same process as Chromium. What that means is that if Chromium's doing work, your application's not doing work. The structure that we set up with OWL means that Atlas is able to work in parallel with the rest of all the activities related to rendering the web and producing web pages, which is pretty cool. So if the browser part crashes...

Yeah, if the OWL part crashes, if something goes wrong with that piece, then Atlas can restart OWL.
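The restart behavior described here follows a general supervisor pattern: the lightweight application owns a handle to a separate engine process and relaunches it if it exits. The sketch below shows that pattern with Python's standard `subprocess` module. It is not how Atlas or OWL is implemented; the `EngineSupervisor` class and the stand-in engine command are assumptions made for illustration.

```python
from __future__ import annotations

import subprocess
import sys


class EngineSupervisor:
    """Hypothetical shell-side supervisor for a separately running engine process."""

    def __init__(self, command: list[str]) -> None:
        self.command = command
        self.restarts = 0
        self.process = subprocess.Popen(command)

    def ensure_running(self) -> bool:
        """Relaunch the engine if it has exited; return True if a restart happened."""
        if self.process.poll() is None:  # None means the process is still alive
            return False
        self.process = subprocess.Popen(self.command)
        self.restarts += 1
        return True

    def stop(self) -> None:
        """Terminate the engine if needed and wait for it to exit."""
        if self.process.poll() is None:
            self.process.terminate()
        self.process.wait()


def stand_in_engine_command() -> list[str]:
    """A placeholder 'engine': a Python child process that just sleeps."""
    return [sys.executable, "-c", "import time; time.sleep(30)"]
```

Because the shell keeps its own state (the tab list, for instance) outside the engine process, a crash in the engine does not have to take the shell down with it; the shell only needs to notice the exit and start a fresh engine.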

That's actually a really interesting thing, because I remember back when we were originally doing the design of Chrome, there was this thought of, well, the web page might crash, and therefore your browser should be around. And then because Chromium has sort of become this very sophisticated platform for web pages, it itself has become super complicated, and now Atlas is this very lightweight frame around the outside that really is about that core productivity use case of using a browser

with ChatGPT kind of as this tool that you can bring down onto any page. And that's really where its focus is. Whereas the Chromium, i.e. OWL, piece is able to focus on being that platform. The parts are resilient to each other's, you know, difficulties. Yeah, this is true in that sense. Yeah, just actually another fun fact about OpenAI and the benefit of the system that we've built. Every engineer that starts at OpenAI merges code on their first day.

And if you think about how massive Chromium is as a platform, it's really super powerful, but it's a lot of code. It takes a while to get all that code onto your device. Just a little bit complicated. Yeah, and it takes a while to build it all. And so...

We tend to get our new hires in the afternoon as they've gone through all of their onboarding training. And then they have to merge a change. So to be able to check out all of that code and build it and then make a change to the code in your first afternoon can be pretty tough if you have to do all of that.

But because we have structured this in a way that they don't have to, you can go and make a change to the Atlas side, get that checked out and built very, very quickly. Our engineers are able to be productive right away and merge code on their first day, like, ship features. Related to this, always when you're starting a new project, you get to make new technology choices. When we were starting Chrome, we got to say, what is the latest and greatest way to build code?

Fast forward to starting Atlas, we're like, what's the latest and greatest way to build a native app on macOS? So of course we're going to be using Swift, we're going to be using SwiftUI where it makes sense, and we're going to be, you know, using all these...

The app is built in Xcode and just done in a very familiar way. So people who are used to doing Swift development, maybe because they're building iOS apps, they can come in and just be instantly productive because this is not a foreign code base, not a foreign system. And yet it's harnessing the power of Chromium at the same time, which is super complicated. When you look at this from the outside, a lot of people draw comparisons to go, well, you know, there was Operator, now they're doing

ChatGPT Atlas. Is this going to be a real thing for them or is this another experiment? So a browser, I think it's a super core tool for productivity.

Why Build on Chromium? Technical Decisions

And it's something that you need to be able to count on. And so my view on this has always been that this is a long-term investment, and that's the way that we're approaching it. There's a set of functionality that we've launched, which is sort of the first phase. If I come back to browser history, I'd sort of say it's the Netscape 1.0, if you like, of this new era of web browsing.

So there's going to be a lot of future improvements to come: features that we're building because people have told us about them in the initial set of feedback, things that we come up with through our partnership with research, a whole host of new functionality that will come out over the course of time. The other thing that we hear from folks is they want to see this across different platforms. Yeah, basically, this browser isn't available for my Windows device or on my phone, that sort of thing. So these are things that we're thinking about and working on. There's a long roadmap of enhancements. And so we want people to both feel confident that this will get better over the course of time. It actually gets better every week when we push an update.

And they will see it increasingly on more of the surfaces where they are. So definitely a long-term investment for us.

Yeah, we got this app to the point where internally, the users at OpenAI, we were seeing them enjoy this product. And we were at a point where the kind of feedback we were getting was, you know, why haven't you shipped it yet?

Why haven't you shipped it yet is exactly what I was thinking.

Because we weren't getting new feedback. And we realized we're ready to ship this. We want to share this with the world. We want to hear how other people are experiencing it. And I think that, you know, this feedback's been all kinds of amazing. Obviously there's been, you know, people who have their pet feature that's missing, or like Ben said, you know, where's Windows, when? But the other day, my 14-year-old son came back and he's like, oh, man, my friends, it's cool, they love this browser, kind of a thing. And I was asking, like, really, what do they love about it? And then he was just talking about all the different AI features that they were checking out.

I think it's really interesting to see that sort of spark of fascination from people, whether it's kids or, you know, people we work with or friends, as they share their experiences with this thing. I don't know, I just love to see how people... My wife was so giddy when she first got to try Atlas when we did the friends and family testing, just to be able to go and explore, researching some tasks and asking the side chat about what she was looking at. She had so much fun with it.

My wife loves it. This is not a lie or an exaggeration. The night that it came out, we're sitting in bed, I'm reading, and I look over and ask how she's doing, and she says, "I swear to you, I can't stop thinking about ChatGPT Atlas." Because for her, it was her first use of an agentic system like this. And to be able to go do these things was a huge unlock. It was her favorite tool, ChatGPT, now connected to the browser.

Yep, exactly. Because it's just the tediousness that it takes away. When I'm looking at this website, or I'm doing some research, I can just ask it now about the thing I was talking to it about before. And now we can take this as context and we can kind of keep going and research, and whatever the topic is, it just becomes a lot easier when the AI, when the model, is right there with you.

I think a lot of folks struggle with how to do what sometimes seem like very complex tasks on certain websites. One of those, I think, is if I'm on a spreadsheet, pick your favorite web-based spreadsheet program: how do I visualize this data in a certain way? If you just have a tool there that you can ask, that can drop in and help in a very consistent way, I think that's really interesting. The other piece that I think is interesting, related to your wife's reaction, is that this is a lot of net new capability, I think, for a lot of folks in the world.

What I'm really excited about with this tool is that our model capabilities are always evolving. So at any given point in time, it's not the ultimate state of it. But we kind of get to show people how this stuff works and, I think, with that, build some more understanding, some more trust about how this technology is working on your behalf. Even if we came up with the world's best model tomorrow that solved every problem perfectly in the way that you personally would have wanted that problem solved, you probably would still want to be able to come along for the ride and understand how that was actually happening, just for your own education. I think over time, as your trust level builds, eventually you'll feel comfortable fully delegating very complex tasks to this technology. But we're not at that stage yet.

One of the things I'm really happy and excited about is that people are able to come and observe the next step of this technology and watch it, see how it does, tell us, yell at us when it doesn't work, that kind of thing. But I think you can kind of get a feel for how it works, and as a result you will know what it's going to be capable of. And you'll kind of know what the controls are, like where you want it to stop doing something or you want to do something differently. You can just tell it, and so on. I think you can kind of dial up and down how much you use the model and the ways you use it in this product. But it's kind of right there.

Parallel Architecture and Developer Experience

It's easy to try again. And I think sometimes the magic I've felt with ChatGPT is when it really works so well that I'm like, oh, I'm going to go back for that, right? You have those magic moments where it's like, oh my gosh, I'm going to change my habits because of how this works, right? And I think in this case, it's easy for people to use familiar patterns, right? I open the new tab page, I do a search, it lands me in an experience that includes some links I can click on, but there's also the model response. And so you start to learn that, like, maybe instead of that website I was looking for that was going to answer my question, it's just right there. And I can go and explore that and I can ask it another question. You start to learn the power of this model just by virtue of using the product in a familiar, normal way.

And so for me, that's kind of exciting to see how people... it might open the door to people starting to realize just what the capabilities of this model are.

Looking to the future, first, let's start short term. I can understand how, for a Windows version, you're going to pretty much want parity with what you have right now in the Mac version. But when you talk about mobile, it's one thing when I have a lot of desktop space and I can put a sidebar and have the chat thing there. Mobile browsing, as you guys know especially, is a very tricky thing, and moving a search tab from the top to the bottom or whatever seems like a revolutionary change. How are you thinking about that, the mobile experience? And also, are we going to be thinking more agentically in how we use these links?

So maybe fun fact: some of our initial explorations were actually on mobile.

And part of the way we were thinking about it was really, what does it mean to bring the model to the web? And that can take many different forms. And of course, ChatGPT exists as an app on your phone, and you can imagine ways in which you might share to that the kinds of memories that have been generated by using Atlas, right? So there's a lot of different flavors and forms in which mobile could come, how this can manifest on mobile. But at the basic level, you know, we're going to be looking at, like, how can we bring the web to the model? What makes sense on mobile? The UX may look a little bit different. Obviously, as you point out, there's a different form factor there. We've got very talented designers, though. I'm sure that they'll come up with a good way to solve some of this stuff.

Definitely, I think on mobile, or at least I noticed through my own use, my use case is far more information retrieval. Like, I have a question... You know, actually, I use the ChatGPT app a lot. If I have a question about something I'm around, I point the camera at the thing and ask, like, what is this? That type of thing. In situations like that, where there is web content that is part of that journey, we want to make sure that the user flow that you have with Atlas, where you can view some web content, ask follow-up questions, and then go back and look at the content again, feels very good. That's something that we're sort of in the midst of figuring out right now. And so, you know, not too much more thought on it than that, other than that it's something that we're going to want to make sure feels very good.

Yeah, I think we hear from people about the importance of mobile because, you know, they are doing work on their laptop and they want to continue in some fashion on their phone. So you can start to imagine the space of possibilities there.

Where are we going to be in five years with how we're using the web and how we use tools like this?

So I would love to be in a place where people think less about the particulars of the tools they're using and are more just expressing what they want to the system. And then the system is smart enough to understand how to respond to that in a good way. And so in that sense, as humans, we can be focused on the highest-order piece, which is, like, what is the most interesting thing for me to do? The model, maybe, can take over; an agent can drive maybe the less appetizing part of the work. I use the word toil to describe some of that sort of grunt work of pulling information from a bunch of different sources. Maybe it can do a bunch of things that just seem very difficult to you because you've not done them before. It sort of knows how to do those things. And then you can be focused on the things that you want to do. So I imagine a world where... Actually, there may be a lot of internet traffic in the future that is agentic. That may even be most of the internet traffic. I still see people as generating a bunch of traffic as well.

Long-Term Vision and User Feedback

But that should be, you know, efficient. People should be focused on doing the things that they want to do and delegating more of this work, more of this toil, really, to agents that can take on a lot of that other stuff. And then, if you have to make a decision on a project, maybe your agent comes back to you and gives you some choices. You know, if you're going to take a vacation, do you want to stay at this hotel or that hotel? You can pick between them. You get to make a choice, but for all of the sort of grungy clicking around and scouring the internet for these things, maybe it took the first pass and presented you some choices.

We've talked a bit about how, as users and developers of this tool, you look at it. If I was somebody who had an e-commerce site, and I'm going to be putting information on the web, and I know that, one, there've been conversations for a while about what happens when LLMs go search the web, and now we get into the world of agents, when agents are using this, plus, you know, LLM-powered browsers. What advice, what direction would you be telling people to think about?

You mean the publishers?

Yeah.

Yeah. I mean, I think that it's really interesting. In some ways I think about the... maybe a little bit related to what Ben was talking about.

You see the more recent models, they've learned how to, based on the query, decide, hey, I should actually look at the internet to answer your question, right? And so I think it's really interesting how these models can help you connect to publishers and web apps and whatever content is out there, either giving you a snippet and a citation to it so you can go deeper, or even just connecting you to it, because if your intent was to navigate to a site, then it can help you get there. I think one of the things we've been exploring with Atlas is how to, for example, better handle and better serve navigational intents. Sometimes people come to their browser with absolutely the intent of, I want to buy this product on this site. And that's our job, just to get you there as fast as possible. And so an element of building Atlas was actually making sure we're serving those kinds of queries well.

Other times you just want to probe the knowledge of the model and have it go and research something for you. And sometimes that involves it needing to invoke tools to do that on your behalf. And so again, it sort of depends. Going back to what Ben was talking about, I imagine a world in the future where you don't have to be so prescriptive about what tool you want the model to use, but rather it has this incredible palette of tools that it can draw upon, and some of them can be, you know, actuating your browser, things like this.

Are we going to be using web pages in 10 years?

I think so. You know, it's kind of this fabric of this world where people are publishing through it. It's the core primitive, how people are putting content out there, right? So it's kind of the...

You know, the Internet superhighway, all those kinds of analogies. But it is this open fabric on which people can publish. And I don't see that changing. It's the world's largest and, as you mentioned before, most open platform. And I think some of that power of openness is always going to make it attractive for people to... The way I look at it is these tools, yes, they are able to understand that fabric, understand that internet, understand the content that's out there, but they're also able to bring that content to people and connect people to that content. And it can be very powerful. And again, it's all in service of what that user's intent is.

I think it's fairly interesting to think about how we can do a better and better job of that, really serving the needs of those users. And ultimately, as people are putting content out there that's intended for people, if you're putting out content that's, you know, a gallery for somebody to go shopping or something like this, we want to help people find that, help people get connected to that, help people with the journey that they're on, whatever that may be.

This is exciting. Thank you for sharing this. Any last suggestions, any power user tips?

Definitely the scrolling tabs feature is a favorite for both of us.

So challenge yourself. Like I said, at this point it's very early stage, but challenge yourself with your curiosity on any given page. Ask a question on more pages that you visit. And you might be surprised with what you come up with.

Awesome.

And we'd love to hear from everybody about how they're experiencing the product. So please keep the feedback coming. Definitely. Ben, Darin, thank you very much. Thank you. Thank you.

This transcript was generated by Metacast using AI and may contain inaccuracies.