An Advanced Look at AI Agents, Part 2 | #868 | CXOTalk podcast

00:00

What are AI agents and will they really change the world as some folks claim? Today on episode 868, we're taking an advanced look at agentic AI with a prominent VC investor. Praveen Akkiraju is managing director at the venture capital firm Insight Partners, where he's focused on AI agents. This is his second appearance on CXOTalk to discuss agentic AI. Praveen, welcome back. Thank you, Michael. Excited to be here. What are AI agents?

00:38

You can think of AI agents as application software that is able to understand a user's intent and be able to reason on that user's intent, come up with a plan and execute the plan. At a broad level, that's really what an AI agent is. So today, if you kind of look at how that task is done, you typically have an application, you have a human who comes up with the plan and executes the plan, right?

01:05

So the transition that we are in the process of making with AI is that we're now offloading a lot of that planning process and the execution process to these agents and allowing them to use applications as humans would, right? So that's really the paradigm shift that we are in the process of making. That's why AI agents are so exciting. What is the role of LLMs, large language models, in this

01:32

whole agentic AI world? It's important to understand that AI agents are not just all about large language models, right? You have to think about AI agents as essentially another software application which now incorporates large language models in areas where they add the most value. So let's take a step back and think about sort of how applications were previously defined. You used to have a database, a

02:03

system of record. You used to have a software workflow that was built on top of this. And then you had a user interface. That's typically what your classical application looks like today in the SaaS context. So the power of large language models is that they are able to insert into the stack at various different points and be able to kind of dramatically improve the productivity and the capability of these application software

02:33

models. So you can think of large language models as playing different roles in the sort of new paradigm we call AI agents. The first layer would be the user interface itself. You know, we are all now used to the ChatGPT interface where we go in and you essentially type in something or you can even speak to it and, you know, it's able to kind of understand that context and comes back with an answer as opposed to, you know, a set of reference points or the 10 blue links, right?

03:01

So similarly, in the application context, you have a user interface now where the user can interact and basically say, hey, show me the data for a particular region for a particular product, and that LLM is then able to take that input and synthesize that and understand the context of it as opposed to the user having to go and figure that out, right?

03:23

So that's one layer. The second layer, which is a really important layer, and it's in some ways almost the core of an AI agentic architecture, is a reasoning layer, right, which is the ability to take the task given to the agent and break it down into a set of subtasks. So for example, if you're saying let's go build a research report

03:45

on a particular stock, right? So if that is the prompt that the user provides, the AI agent then takes it down and says, OK, how do I build a research report on this particular stock? I have to go to, you know, Yahoo Finance or one of these public websites, get all the information about that stock, be able to kind of synthesize all of that information, create a report and then publish a report. So it breaks that down to specific tasks. That's the reasoning layer,

04:14

right? The next layer where the LLMs play a role is the ability to go out and dynamically query information. So in the past sort of application context where things were static, the application could only look at things where it had an API call built in. So if it had access to a particular data store and had an API call, it would go pick that API call. Here we have an LLM that can reason and say, wait a minute, this information is available to me externally.

04:43

This information is available to me through this API call internally. And I need to get both of these pieces of information in order for me to execute my task, which is to build this research report. So let me launch a web search API. Let me also launch an API to my internal research store, put those data points together, right? So the LLM has the ability to understand what sources of data it needs to go get the information from for it to

05:14

execute on the task. Now once it does that, right, the application software synthesizes it and then you come to the next step, which is, you have the output. A lot of really well designed agents have strong evaluation loops and this is really, really important, right? Once the agent actually comes up with the output, you test that output against, you know, what essentially is the gold standard that you set up and say this is what a research report looks like.

05:40

So the agent compares the output it got with the gold standard and says, does this match it? And then it corrects itself until it matches that standard, right? So as you can see, you're inserting the large language model at different layers of the AI agent in order for it to be able to make the entire process a lot more dynamic and interactive. And that's basically what is different between a classic application flow and an AI agent application flow.
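The layers described above (reasoning, dynamic tool selection, synthesis, evaluation against a gold standard) can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the tool functions, the `REQUIRED_SECTIONS` rubric, and the stubbed `plan` step are all hypothetical stand-ins, and a real agent would call an LLM where the stubs return fixed values.

```python
from typing import Callable

# Hypothetical tools; a real agent would call a web search API and an
# internal research store here instead of returning canned strings.
def web_search(ticker: str) -> str:
    return f"public filings and news for {ticker}"

def internal_research(ticker: str) -> str:
    return f"internal analyst notes for {ticker}"

TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "internal_research": internal_research,
}

# Assumed "gold standard": sections every finished report must contain.
REQUIRED_SECTIONS = ("Summary", "Public data", "Internal data")

def plan(task: str) -> list[str]:
    """Reasoning layer (stubbed): decide which tools the task needs.
    A real agent would ask an LLM to produce this list from the task."""
    return ["web_search", "internal_research"]

def draft(ticker: str, findings: dict[str, str], sections: list[str]) -> str:
    """Synthesis step: assemble the requested sections into a report."""
    bodies = {
        "Summary": f"Research report on {ticker}",
        "Public data": findings["web_search"],
        "Internal data": findings["internal_research"],
    }
    return "\n".join(f"{name}: {bodies[name]}" for name in sections)

def evaluate(report: str) -> list[str]:
    """Evaluation step: list the required sections missing from the report."""
    return [s for s in REQUIRED_SECTIONS if f"{s}:" not in report]

def run_agent(ticker: str, max_rounds: int = 3) -> tuple[str, int]:
    steps = plan(f"build a research report on {ticker}")
    findings = {name: TOOLS[name](ticker) for name in steps}
    # Start with a deliberately incomplete draft so the loop has work to do.
    sections = ["Summary", "Public data"]
    for round_number in range(1, max_rounds + 1):
        report = draft(ticker, findings, sections)
        missing = evaluate(report)
        if not missing:
            return report, round_number
        sections.extend(missing)  # correct the draft and try again
    raise RuntimeError("report never met the gold standard")

report, rounds = run_agent("ACME")
print(f"accepted after {rounds} round(s)")
print(report)
```

The first draft fails the check because it lacks the internal-data section, so the loop revises it once before accepting it; the evaluation step is ordinary deterministic code even though the planning and drafting steps would be LLM calls in practice.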

06:11

So we have this reasoning, we have access to this broad body of human knowledge, and of course that is different from traditional applications. How accurate are these agents today? What is the state-of-the-art? How useful are these agents in practice? We are quite early in this journey. You know, it feels like when you read the press you have, you know, AI agents are everywhere, right?

06:44

Or if you go look at a website today for any software company, essentially AI agents are the way they are now expressing their products. However, I think in terms of the maturity of the technology, both from the important question you talked about, how reliable they are, but also in terms of, on the customer end user side, the experience of the end users in terms of the deployments and the scalability and reliability of these, we're still in the early

07:16

days. Now, I will say that there's a spectrum of agents, right? So at one end of the spectrum, you could have these consumer agents which are really focused on individuals. You know, we were talking about this before the show started: OpenAI just launched their consumer agent called Operator.

07:38

Very interesting. You know, it's for us as individuals, part of the Pro plan. And the other end of the spectrum is essentially the next phase of evolution of RPA, or robotic process automation, right, in the enterprise. So agents that take on complex enterprise workflows, right? So we have a tremendous amount of activity all along the spectrum in terms of, you know, start-ups as well as incumbents advancing the state-of-the-art, right, experimenting with AI agents.

08:09

So we are in the early days primarily because there are a few things that we are still trying to figure out right now. Large language models by themselves are non-deterministic. And what I mean by that is, and again, we've all experienced this: when you ask ChatGPT a question a certain way and you ask the same question in a slightly different way, you may get a different answer.

08:34

Right? Now, that's getting better with some of the newer models, particularly some of the reasoning models that are able to correct themselves. But the fact of the matter is that the core of the large language model is this sort of non-determinism. And so a lot of the AI agentic designers today, builders today, are working on what we call scaffolding in order for us to essentially take the power of these large language models and harness them.

09:01

At the same time, making sure that we're able to understand that the non determinism exists and we have the right architecture to be able to handle that. So you get a reliable, consistent and most importantly, scalable AI agent. Where are we there? Because of course that non determinism, if you press enter, submit again on a prompt, it's going to give you a different

09:30

answer, is great if you're wanting help writing an outline or some type of summary of a document because there's different perspectives you can take. But if you want it to run a task like book me an airline ticket, you don't want a lot of stuff all over the map. You just want one thing done. You want that ticket to be booked in the right place with the most efficiency and so forth. So it becomes a big problem I

09:54

would think. Yeah, and I think this is, again, a central design consideration, right, when you're building AI agents. And I think we can maybe break this down into like three parts. So the first part is the task itself. And you gave an example of a task, right? So how important is it for you to get the task absolutely right? Now in ChatGPT, you know, if it gives you a slightly different answer, it's like search, right? You're getting information, you're not

10:25

essentially making a decision. So you're OK with some level of non-determinism because the human mind is able to correct for that, right? At the other end of the spectrum, if you're essentially betting on this AI agent to execute a task consistently, it could be, you know, an enterprise workflow, a back-end workflow, right? It could be code generation or, you know, essentially a customer service interaction agent. You cannot have that level of

10:57

non-determinism. So that's one thing: how do you define the task? And I think that's a key question to ask. The second aspect that you want to look at is, OK, now let's say you have a task that you need to be accurate. And it's also, again, important to understand that AI agents are

11:13

not all about just LLMs. There's a lot of existing software, there's a lot of reusing, you know, machine learning models, predictive models, which are deterministic, right, as well as the classic things that you as a software engineer do to build applications that go into making an AI agent, right? So I'll say this again: AI LLMs are a tool. They're not the product, right? They're not the agent, they're a tool. So you have to understand, as with any tool, what the

11:43

capabilities are. So that's the second piece. It's like, when you think about the AI agent, you have to think about leveraging the LLM in the right places. Then the third piece is, OK, so now you've figured out, OK, my large language model is going to help me with building a plan, for example, the reasoning layer that we talked about earlier. So most AI agents today essentially propose a plan and you have a human in the loop that then approves the plan.
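The propose-then-approve pattern just described can be sketched as follows. This is a minimal illustration under stated assumptions: `propose_plan` is a hypothetical stub standing in for an LLM planning call, and the reviewer is passed in as a plain function so the human step (a UI prompt, a ticket, a code review) can be swapped for whatever a real system uses.

```python
from typing import Callable, Optional

Plan = list[str]

def propose_plan(task: str) -> Plan:
    """Hypothetical stub for the LLM planning step."""
    return [
        f"look up options for: {task}",
        "pick the cheapest option",
        "purchase it",
    ]

def execute_with_approval(
    task: str,
    review: Callable[[Plan], Optional[Plan]],
    execute_step: Callable[[str], str],
) -> list[str]:
    """Run a task only after a human has approved (and possibly edited) the plan.

    `review` returns the plan to run, or None to reject it outright.
    Nothing is executed before the reviewer has seen the full plan.
    """
    proposed = propose_plan(task)
    approved = review(proposed)
    if approved is None:
        return []  # rejected: the agent takes no action
    return [execute_step(step) for step in approved]

# Example reviewer: a human who swaps the purchase step for a safer one.
def cautious_reviewer(plan: Plan) -> Optional[Plan]:
    return [
        "hold for confirmation" if step == "purchase it" else step
        for step in plan
    ]

results = execute_with_approval(
    "flight from SFO to JFK",
    review=cautious_reviewer,
    execute_step=lambda step: f"done: {step}",
)
for line in results:
    print(line)
```

Keeping the approval gate outside the planning function means the same agent can run with a strict reviewer for high-stakes tasks and a permissive one for low-stakes tasks without changing the agent itself.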

12:12

So a good example of this is a lot of the coding, you know, copilots and coding agents that you have today. And particularly with Claude Sonnet 3.5, that was, I think, a step function jump in the ability of large language models to accurately generate code, right? The way they work is essentially, you know, the programmer's interacting with the large language model. It proposes a plan which is approved and edited by the programmer before it actually

12:39

goes out and executes it, right? So the so-called sort of user in the loop, human in the loop, is a very critical design component today in AI agents, particularly in that planning layer, right? Now, there are other things you can do. And we talked about part of this, which is, you know, what we call a reflection loop. So once the output of the agent comes out, you test it against another large language model which is trained with the right

13:05

output. So the model, the agent essentially tests itself to say, did I get this right? And it's able to kind of think on whether the output is correct and then make those changes and go back and iterate again, right? So these kind of reflection loops, the way you build evaluations, which is how do you determine the output is correct? And using that data to continuously improve the AI agent is again part of the

13:31

design process. So I gave you like three different things that you have to consider. There's a lot more we can go into in terms of depth, but at a broad level, you know, the takeaway is that you can design AI agents to be deterministic, assuming that you understand the task, right, you've provided the right scaffolding, the evaluation loops, and you essentially involve the human in

13:56

the loop. Today, AI agents work well when you have a human in the loop, and I think that's going to be the case particularly on this end of the spectrum where output needs to be, decisions need to be a lot more accurate. Subscribe to the CXO Talk newsletter so you can join our community and we can tell you about our upcoming shows, which we have great ones. We have questions that are stacking up on LinkedIn, So let's jump to a few questions

14:27

right now. And the first question is from Ravi Karkara. And he says, where do you see American universities on creating a skilled workforce for the AI driven world economy? How will they learn to work with and deploy AI agents? I consider AI large language models to be a tool, right? Just as, you know, we had cloud, mobile, all these fundamental platform shifts, AI and large language models are a tool to help us accomplish our task.

15:07

So what I mean by that is it still is important for you to deeply understand what is the problem you're trying to solve with this particular tool. Are you trying to book an airline ticket like, Michael, you had mentioned earlier? Are you trying to execute a payroll function in an enterprise? Are you trying to respond to a

15:30

customer support question? So understanding the actual task, which is what, you know, essentially good product management is, is foundational to understanding how this tool, this new tool, much more powerful, obviously dramatically more impactful than anything that we've seen in the past, is going to change how we, basically, as educators, as knowledge workers or as consumers leverage AI.

16:01

So in terms of, you know, how universities approach this, it depends on sort of how you participate in this. I mean, at one end of the spectrum, you know, the education in sciences and math, and good grounding in that, helps you be part of the design process of this. On the other end of the spectrum, you know, if you're more business oriented, you know, a deep understanding of your problem and essentially how the tool works. I mean, we all are experimenting with GPT today.

16:33

I use it differently, my daughter uses it differently. You know, hopefully none of her teachers are listening, but, you know, she uses it sometimes to help her with her homework, right? And so she's learning as part of that process just as I'm learning to use it for my use cases. And so are all of us, right? So I think it's a tool that we experiment with, but a deep understanding of the problem and figuring out how to use this tool is foundational.

16:59

So, you know, product thinking, I think, is another key aspect, which I think is something we should emphasize. You know, if you're not deep in the algorithms and math, you still have a tremendous role to play by understanding sort of how do you build products, how do you solve customer problems, right? And I think the third aspect of this is there's a huge human element to this. We just talked about sort of how do AI agents function today?

17:24

And one of the things I said is human in the loop, right? And you know, will AI agents get better at reasoning? Probably. I mean, the o3 model is amazing, right? And you've seen huge advances just in the last, you know, few weeks, the later part of last year, in terms of the reasoning capability. However, throughout the steps you have to think about where the role of the human is. Is it an evaluation function? Is it a reasoning function?

17:53

And so humans are always going to be involved and being able to sort of understand and engage with the technology is effectively the most important thing, right? There is a massive human side of this as much as there's a technology side of this, right? So those are some of the areas. And, you know, I'm not an educator. So as I approach this like, that's the way I think about it.

18:15

This is from Suresh Babu Madala and he says do we need to train agents or do agents interpret the question as a step by step task? That's a great question. You do need to train agents. Again, there's a spectrum of what these agents can do in terms of simple tasks versus

18:34

really complex tasks. And the level of training obviously will depend on what you're expecting the agents to do. You know, the first phase, as you're inserting the agent into a particular task, there's a certain amount of grounding that needs to happen.

18:50

And the grounding typically is, you know, connecting it to the right data sources, connecting it to the right application sources, and making sure that it understands and is grounded in the policies of your particular use case or your particular enterprise. So that's the initial part. However, training, you know, just like we as human beings, right, we're constantly learning,

19:12

right? And let's say you onboard a new employee, you know, a fresh college grad, they are in a constant training process, right? They learn, you have somebody who watches their output and you give them feedback, right? And hopefully you observe them continuing to improve. It's the same exact thing for AI agents. That's why these evaluation loops are such an important aspect of designing an AI agent. I can't stress that enough, right?

19:38

You have to be able to kind of constantly understand the output of the agent, figure out where you can correct it and continue to improve. And I think the last part of this is the observability of how these agents are functioning, right? Where are sort of the broader efficiencies and inefficiencies of what they are doing and not doing? That is a constant part of sort of your architecture. This would be an excellent time for everybody listening to subscribe to the CXO Talk newsletter.

20:08

Go to cxotalk.com. If you're watching on LinkedIn, take a look at the address on your screen and subscribe to our newsletter so we can notify you of upcoming shows and you can be part of this amazing, amazing CXO Talk community. Our next question is from Greg Walters. And you kind of address this a little bit. But, he says, can't non determinism be prompted into existence? The first aspect of this is to really understand, you know, the role of the LLM in your task,

20:43

right? And what is the level of reliability that you expect from the LLM's task? Now there's certain things where the capability of the LLMs is getting constantly better, and, you know, there's a reason why a lot of large language models are fundamentally trained on math and on coding tasks, because there's a lot of transfer learning: as they get good at coding, as they get good at math, they're also able to get good at reasoning tasks, right, which are much more broadly

21:12

applied. That's why there's a lot of focus on those tasks. So the first point I'll say is these models will progressively get better at managing the hallucinations, whether it is through reasoning loops, right, or whether it is through better post training of the models in your deployments or whether it's inference time reasoning, right, which is sort of another, you know, scaling

21:34

lever that we now have. There are different techniques, right, in the model itself, which allow you to decrease the aperture of the non-determinism, right? So that is one vector. The other vector is, as I mentioned earlier, building a scaffolding around it, understanding that you're going to get an output from the large language model that needs to be synthesized into something that's a little bit more reliable, right, and gets to the level of output that is acceptable for you.
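One way to picture this kind of scaffolding is a deterministic validator wrapped around a non-deterministic model call, with bounded retries and escalation to a human when nothing valid comes back. The sketch below is an assumption-laden illustration, not a description of any particular product: the booking fields, `validate_booking`, and the canned-reply `make_flaky_model` helper are hypothetical, and the "model" is simulated so the example runs without an LLM.

```python
import json
from typing import Callable, Iterable, Optional

REQUIRED_FIELDS = {"origin", "destination", "date"}

def validate_booking(raw: str) -> Optional[dict]:
    """Deterministic check on model output: must be a JSON object with
    exactly the required fields and distinct origin and destination."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != REQUIRED_FIELDS:
        return None
    if data["origin"] == data["destination"]:
        return None
    return data

def call_with_scaffolding(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Retry the model until its output passes validation; if it never
    does, hand the task to a human instead of acting on bad output."""
    for attempt in range(1, max_attempts + 1):
        booking = validate_booking(model(prompt))
        if booking is not None:
            return {"status": "ok", "attempts": attempt, "booking": booking}
    return {"status": "needs_human", "attempts": max_attempts, "booking": None}

def make_flaky_model(replies: Iterable[str]) -> Callable[[str], str]:
    """Simulated non-deterministic model: returns the next canned reply
    on each call, regardless of the prompt."""
    iterator = iter(replies)
    return lambda prompt: next(iterator)

model = make_flaky_model([
    "Sure! I booked it.",  # not JSON
    '{"origin": "SFO", "destination": "SFO", "date": "2025-03-01"}',  # invalid route
    '{"origin": "SFO", "destination": "JFK", "date": "2025-03-01"}',  # valid
])
result = call_with_scaffolding(model, "Book SFO to JFK on 2025-03-01")
print(result["status"], "after", result["attempts"], "attempt(s)")
```

The model can still vary from call to call; what the scaffolding guarantees is that only output passing the fixed checks is ever acted on, and that repeated failure ends in a human handoff rather than a bad booking.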

22:01

So that's a reflection loop. That's your evaluations. And sometimes you may just have a deterministic runtime that you need, right, to design into the AI agentic workflow. And Gus Speckdash says it's interesting that agentic AI goes around the ridiculously frustrating prompt user interface that is not integrated with any workflow. This is huge. What are your thoughts about agentic AI simply around the user interface and integration with workflows?

22:35

This is the quantum leap in my opinion, right, in terms of the user interface and user engagement. In a lot of ways, I think we've had some form of natural language processing, NLP-type interfaces, right? You know, you can think of chatbots today. It's hard to escape them. You know, if you're trying to book a ticket or anything, the first thing that pops up is a chatbot that's trying to get your information, right? So that's natural language processing.

22:59

You know, it's able to understand voice, translates it into text, you know, does a search and responds back to you. I think the opportunity with large language models is the ability to infer context, right? What we had in the previous generation of chatbots was a literal translation, right, and a static sort of rules-based interpretation of that

23:20

translation. So what large language models have now, because they've trained on like the entire corpus of human language data, is they can infer context, they can infer tone, they can infer, you know, the particular sort of intent, and they're able to then appropriately translate that into their query, right, and get you back a response. So I think it's game changing that you would have a large language model from a user interface perspective.

23:47

Now remember, we're still today very text based, right? Most ChatGPT interactions, though they have voice mode, which is amazing, right, are still text based. But think about the ability for us to be multimodal, right? Our ability to do voice, which we're there today, ability to input images, right, which

24:06

we're going to get to, right? And over a period of time these models will effectively get to this point where, as we call it, they have a world understanding, an understanding of the world model. And I think that could be game changing in terms of how we interface with this technology. And this is from Arsalan Khan, who asks a very interesting question. He says we want bias to be removed from data when it comes to AI. How do you remove human bias if humans are in the loop?

24:34

And who decides what's a bias or not? It's a really interesting point. If our vision eventually is that as some companies have articulated that every company has an AI agent and that's sort of the first point of interface for a customer interacting with the company, right? Let's say you're an insurance company or you are a, you know, you're even like a government service, like maybe the DMV, right? At some point that interface is

25:05

really important. So I think, look, obviously the model companies are doing a lot of great work in making sure that we are conscious about bias. Now it is a fact that that is not a perfect solution yet. We're not quite there yet; in certain instances, you know, you're not getting these perfect answers. So part of this is the context that I talked about: how you ground the agent is really important. And so what does context really mean, right? Context means examples.

25:38

And so if it's a customer agent, for example, you could train it on the policies that you have. You know, you probably have a lot of voice recordings of existing agent calls, right, that are great examples of how to handle situations involving bias or confrontational situations. So the training of this agent, grounding it in the policies, right, and the rules of the particular use case is a particularly important task. Now, I think, look, the human in the loop is really about sort of

26:11

how you reason things. And obviously, look, the way a human interacts ideally is a positive in terms of our ability to eliminate bias, right? By inserting a human in the loop, you're actually adding a step that improves the ability of the agent overall to, you know, correct its biases. But I would say, you know, good training, grounding in policies, right, and obviously, you know, having responsible humans in the loop are the things that will help us

26:39

get there. But it's an imperfect process, and that's why I say it's early days yet. Arsalan Khan comes back and he says if subject matter experts train AI to create AI agents, would we really need the subject matter experts? What happens when AI agents come across a scenario that they haven't encountered yet? Training is a constant process for AI agents, right?

27:07

What you get when you initially start this process with a subject matter expert helping you ground the model, ground the AI agent, is you're giving it a certain rubric. You're basically saying, hey, here's basically what is expected, right? Here's a basic task. Here's how you perform the task, right? Now, the task evolves. We are in a dynamic world, right? Let's say you're in a company, maybe you launch a new product, maybe you expand into a new

27:32

region. You know, maybe you make an acquisition or you have new employees onboarded, right? It's a process of constant change. So to some extent, I think you can design the AI agent to say, OK, I can expand into a new geography. This is how I understand it, right? But there may be different

27:50

policies. I mean, as is often the case, if you're operating in a different country, they may have their own, you know, rules and regulations that you now need to kind of ground the model in, and such like. So I think the subject matter expert is really, you know, and we all do this, right? We're all subject matter experts. We're not static, right? We're constantly sort of learning, evolving, understanding, right, new

28:14

technologies, new pieces. And it's the same thing for the model. So yes, the idea of the AI agent is to take away these sort of mundane things, OK, go get this document, put 10 documents together, create a research report, right? So that yes, you don't need to

28:29

retrain that thing. But being able to say, OK, how do I operate in the European Union, which may have a different set of rules, or in Asia, in Japan or in China or in India, right, which may have a different set of rules. Those are things where there's a certain amount of requirement of understanding those rules and regulations that, again, you need to kind of train the agent on, right? Let's start talking about business.

28:54

And Mario Garcia asks. He says it's inspiring to see the impact of AI in Fortune 500 companies. What insights about this stand out to you the most? A lot of these large companies today, I mean, it's been amazing to watch how rapidly, you know, both sort of large companies as well as incumbents and start-ups have really embraced AI and large language models. And it's largely, I think the ChatGPT moment which unleashed this because it, you know, it

29:32

was so accessible. So you take a step back. You know, AI has been around for a long time, right? You know, I, I did a course in neural networks back in my, you know, grad student days. What changed I think in this generation is that today really powerful complex AI models are available on the other end of an API call, right?

29:54

So that level of simplicity in terms of access to really powerful technology is essentially what enabled us to unleash AI as you see it in a lot of these use cases. Now I will again caution you to say that we are still very early. If you talk to a lot of these large corporations, they do have large language models integrated. Most of them have rolled out, for example, some form of a coding copilot. They've rolled out some form of, you know, customer support

30:24

function. They've rolled out some form of an analytics sort of use case with large language models, but we're still not deployed at scale, primarily because we're kind of still tuning, tweaking, right, learning in terms of how do you manage these AI agents? How do you manage biases? As one of your audience members just asked, how do you make sure that they are current, right? How do you make sure they don't hallucinate? The most important thing of the,

30:52

and this is fundamental, right? We all know this. It doesn't matter. There's a lot of really, you know, fun, exciting, interesting demos on X, right? And those are great. You can say, oh, wow, this agent can do this. What's really important is can it do it consistently and can it do it at scale? And so those are the questions we'll answer this year hopefully, right? And that's why we're so excited in 2025 about the trajectory of

31:18

these AI agents. Praveen, you and your team put together what you call a market map of companies involved with AI agents. Can you tell us about that? And I can bring it up on the screen so everybody can see. And there it is. Praveen, can you talk about this market map that you've put together? The market map is a dynamic living thing in the sense that it will evolve constantly, primarily because we're seeing so much activity, right, and energy around, you know, the AI

31:57

agentic space. So what we tried to do is to basically construct this in layers, right? So there's like a foundational layer where there's a lot of these kind of data sources, integrations and such like. There's this new sort of bucket, you know, we call sort of the agent computer interface, which is the ability for AI agents to use computer, you know, tools, right?

32:24

And you know, it could be integrations, it could be, you know, web tools. Some of the stuff is integrated in models, some of these are, you know, interesting kind of platforms that are created for this. So we tried to kind of construct the model, you know, layer by layer. What's the bottom layer, OK, where all the data platforms are, right? What is this sort of middleware layer, if you will, which is the agentic computer interface as well as a lot of

32:47

these agent frameworks, right? You know, we are investors in a company called CrewAI, for example, and I'm sure anybody who's talked about AI agents probably knows about CrewAI. It's one of the most popular open source frameworks out there. You can use CrewAI to build agents. There are others like LangChain, which also do something similar. And above that, then, what we tried to do is to say, let's try and kind of map out sort of where is the energy in the AI agent space?

33:14

And it's important to understand, I think it's on the left side of the market map, there's a lot of AI agentic products and offerings from incumbents. So obviously Salesforce, you know, launched their own sort of agentic workforce, Agentforce as they call it, right? Microsoft has Copilots. We just saw OpenAI launch Operator, which is a more consumer

33:40

oriented agent. And so, you know, effectively all the incumbents have said, hey, we've got these great customers, we've got all these great use cases. Is there a way for us to improve our customer experience or productivity by creating an agentic workflow on top of our existing software? On the rest of the market map, you can see a tremendous amount of energy in different

34:05

verticals. You know, particularly in coding, for example, there's been a huge amount of great work, and Cursor by far seems to be the most popular among developers today. But you also see very specific vertical agents, right, sales, marketing, legal, right? You see finance, right? So you can take almost each of these different functions and you can see companies building agents which are customized to that particular vertical use

34:35

case, right? There's this really interesting company called Samaya, for example, that's doing some amazing work focusing on building agents for the finance workflow, right? So the idea of the market map was not to be precise, right, and capture the entire view. And you know, I do apologize to a lot of the builders out there, some of whom we missed in

35:00

the market map clearly, right. The idea really was to kind of give you a perspective of, you know, what does this landscape look like, right? Where is the activity? How are builders approaching the AI agentic space? What are the opportunities and the use cases, the predominant or the most important use cases for AI agents right now? There are probably like three or four buckets and you know, the first one obviously that everybody knows and understands very well

35:33

are these coding agents, coding copilots, coding platforms, whatever you want to call them, right? And depending on, you know, the particular style, you know, Cursor has a particular way of working. It's more like a copilot. Something like Devin has a different sort of way it engages, you know, it fires off a bunch of agents that, you know, execute your plan and suchlike. But there's a lot of energy around developer-facing AI agentic work, right?

36:03

And so you can see that in the market map as well. The second area is in the customer experience section. So customer experience, obviously everything from like customer support agents, which obviously is the biggest use case. As we were talking about earlier, chatbots are already a fact of life. Can we make that experience much more realistic, much more sort

36:25

of, you know, engaging? So, you know, like me, you're not basically saying "agent" as soon as you get a chatbot, right? So there's a ton of energy in that space, lots of great companies building interesting products. The other area that's kind of interesting is in the operations space, right? So if you think about operations, broadly speaking, it could be IT operations, it could be security operations. You have sort of this needle in the haystack problem.

36:52

You have a lot of data, you have a lot of like alerts that come in and you're trying to figure out like, OK, which ones do I pay attention to, right? So this is actually a perfect use case for AI agents, the ability to synthesize all of

37:04

that information. And if it's grounded in your policies and in a company's particular way of doing things, it's able to say, like, hey, here's maybe the top three to five alerts you need to pay attention to. This is the problem that they're articulating. And here's a few ways to remediate this. So, you know, we were talking to some interesting startups that are actually focused in this particular space. So I think those are the three.
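As a rough illustration of the triage idea described above, here is a minimal sketch in Python. It is not how any particular product works: the `Alert` type, the `POLICY_WEIGHTS` table, and the scoring rule are all hypothetical, and a simple weighted score stands in for the synthesis that an LLM-based agent grounded in company policy would actually perform.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str    # system that raised the alert
    message: str   # human-readable description
    severity: int  # 1 (low) to 5 (critical)


# Hypothetical company policy: how much each system matters to the business.
POLICY_WEIGHTS = {"payments-db": 3.0, "auth-service": 2.0}
DEFAULT_WEIGHT = 1.0


def score(alert: Alert) -> float:
    """Combine alert severity with the policy weight of its source."""
    return alert.severity * POLICY_WEIGHTS.get(alert.source, DEFAULT_WEIGHT)


def top_alerts(alerts: list[Alert], limit: int = 3) -> list[Alert]:
    """Return the highest-scoring alerts, most urgent first."""
    return sorted(alerts, key=score, reverse=True)[:limit]


alerts = [
    Alert("wiki", "disk at 80%", 2),
    Alert("payments-db", "replication lag", 4),
    Alert("auth-service", "login failures spiking", 5),
    Alert("wiki", "certificate expiring", 3),
]

for alert in top_alerts(alerts, limit=2):
    print(alert.source, "-", alert.message)
```

The point of the sketch is only the shape of the workflow: many alerts come in, policy determines what matters, and a short prioritized list comes out.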

37:29

And, you know, there's a lot more, but in the interest of time, maybe I'll just pause there. We have a question from Elizabeth Shaw, and this is on Twitter, who asks how are organizations using agentic AI in their business and their ecosystem? Let's just take for example, a customer service AI agent, right? So that's a use case that we're seeing a lot of customers experimenting with. So what does this customer service agent do?

38:01

So effectively, again, think of a chatbot today. The customer service agent is able to, first of all, be grounded in all of your data, your FAQs. You know, I mistyped my password, how do I recover my password? Or, you know, how do I, whatever, add my child to the insurance policy or whatever. You have these kinds of continuous tasks, you know, where you are able to then interact with

38:31

an agent. The agent understands what you're trying to do. Maybe you're looking for a form, maybe you're looking for a website, maybe you're looking for a particular sort of quote or something like that. So I think that is a use case that is getting a lot of traction. We're seeing a sort of a lot of our customers looking at sort of, you know, rolling that out into production. The other use case that I mentioned is on the coding side, right?

38:56

And no, we're not just talking about an IDE like Cursor, which obviously has a lot of broad adoption, but things like automated testing and provisioning, right? So you have software you need to roll out. Testing is a really critical part of that process of rolling things out. You're able to actually use AI agents, you know, very effectively in that testing and, you know, red team kind of use cases where you can see if you can break it, right?

39:28

That's a very critical function where you're seeing some level of, of deployments happening. The other one, as I said, is like in IT operations, right? And this, this is very exciting because again, if you, you know, these are mission critical, right? You need to be constantly up, you know, most of these IT teams, you know, you always have a 24 by 7 coverage because you cannot have, you know, critical

39:53

systems going down, right? And so an AI agent is perfect because it has the ability to synthesize large amounts of data. It has the ability to basically, you know, solve the needle in the haystack problem, right? Now again, work in progress. I wouldn't say all of these things are at perfection, but we are definitely seeing these. And then I mentioned this company Samaya, which is very interesting.

40:18

They are actively building like a finance analyst kind of agent, which is, you know, pretty accurate in terms of being able to extract context out of research and provide you really focused information. So you say that it's really accurate. I'm assuming that what you also mean is that it is consistently reliable and predictable. Exactly. And it's much harder to do that if it's just straight out of the box, right.

40:50

I think a lot of times there's a lot of confusion in the market about like, hey, wait a minute, the large language models are just going to keep getting better and better and better. And, you know, ultimately there'll be like one model that solves all, right. And I think it is true in certain simple use cases. I mean, absolutely the models are getting better, they're reasoning better.

41:09

Maybe we will get to a point of artificial general intelligence, right where, you know, these models can just use, you know, computers like we do. And, and, and maybe that's the

41:20

bar, right? But I think when you're looking at complex enterprise workflows, as I mentioned, the ability for the agent to be grounded, right, and to be accurate and to present data in the way the end user expects will require some amount of post-training, some amount of inference-time reasoning, right, as well as some amount of scaffolding in order for you to build the perfect agent. Can you talk about the economics of agents?

41:53

And then we'll jump back because we do have additional questions that have come in, but the economics are really important. So what are the aspects that drive economics and what do enterprise buyers need to think about when it comes to the economics of agents? You've seen sort of the whole spectrum of conversations, right? Everything from like, with AI agents, we're just going to go entirely to outcome-based pricing, to like, well, you know,

42:20

it's still software. So we have to kind of figure out how do you make sure that you're able to charge for it appropriately. The fundamental question to ask, right, when you think about pricing is can you measure the value of the AI agent output accurately, right? So what do I mean by that? Let's say you have a customer support agent. You can basically say, hey, the customer support agent handled 100 calls, right? And it would have taken me X amount of dollars to handle those 100

42:53

calls. The customer agent handled those calls. So I can attribute directly a value, right, to the output of that agent. It handled 100 calls. Each call is worth X. So 100X is basically the value of that particular agent's task. At the other end of the spectrum, let's say you're doing, you know, a research workflow, right? So you've generated a research report or you're helping an analyst basically with the research and you improve their productivity.

43:23

How do you measure the value of that? Right? You measure it by, you know, individually asking the analyst, like, how much more productive were you, right? And you know, it's much harder to quantify certain tasks versus certain other tasks. So I think the first question in understanding the economics of agents is, can you attribute value in a reasonably accurate way to the output of the agent?

43:50

So based on that, if you can, then outcome-based pricing is essentially where we're eventually going to go to, right. If it's a lot more nebulous, then I think what we're going to see is some form of an evolution of the existing SaaS pricing model. So you might pay a platform fee, like for the agent thing, and then maybe you hire an agent. So you pay on the number of times you run the agent, right? So it's some combination of that. So we see this again as a

44:16

spectrum. There's no like absolute here, right? There's a lot of experimentation today. In some ways, companies are still trying to figure out, you know, how's the customer getting value, and customers are trying to figure it out too. And if you're a CFO, right, you're used to paying for subscription software, you know, like, OK, I've got X amount of licenses for one year

44:34

and I can budget that, right? Now, if you go to sort of this outcome-based pricing, again, if you don't have an accurate sense of value, how would you as a CFO budget, right, for these AI agents? So I think there's a lot of these things that, you know, we're kind of experimenting with and understanding. Eventually, where there is direct attribution of value, I think we will end up in the outcome based

44:59

pricing bucket. But there's also going to be a lot of these intermediary models where, you know, you want to make sure that the developers are getting a fair value for the product that they're building and the customers are paying a fair price for it. So ultimately when we reach the point where agents have discrete measurable output results, then we can move towards performance based pricing and until then it's essentially usage. Right. Yeah, I think that's a good way to put it.
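The two ends of the pricing spectrum discussed above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names, the `vendor_share` parameter, and all dollar figures are assumptions made for the example, not numbers or a pricing formula from the conversation.

```python
from dataclasses import dataclass


@dataclass
class AgentUsage:
    tasks_completed: int  # e.g. support calls the agent handled
    runs: int             # number of times the agent was invoked


def attributed_value(usage: AgentUsage, value_per_task: float) -> float:
    """Directly attributable value: 100 calls worth X each is 100X."""
    return usage.tasks_completed * value_per_task


def outcome_based_charge(usage: AgentUsage, value_per_task: float,
                         vendor_share: float) -> float:
    """Hypothetical outcome pricing: the vendor takes a share of measured value."""
    return attributed_value(usage, value_per_task) * vendor_share


def usage_based_charge(usage: AgentUsage, platform_fee: float,
                       price_per_run: float) -> float:
    """Evolved SaaS pricing: a platform fee plus a price per agent run."""
    return platform_fee + usage.runs * price_per_run


month = AgentUsage(tasks_completed=100, runs=100)

# Value is measurable (support calls): price against the outcome.
print(outcome_based_charge(month, value_per_task=4.0, vendor_share=0.25))

# Value is nebulous (research assistance): fall back to fee plus usage.
print(usage_based_charge(month, platform_fee=50.0, price_per_run=0.5))
```

The outcome-based function only makes sense when `value_per_task` can be measured with reasonable accuracy, which is exactly the condition raised in the discussion; when it cannot, the usage-based function is the fallback.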

45:31

We have an important point now raised on LinkedIn by Naresh Kumar, who is VP and General Manager of Product Management at Zscaler. And he raises the question, what about security and agentic AI? And we haven't talked about that. So I'm glad you brought this up. Large language models help with this sort of needle in the haystack problem, which is inherent to sort of diagnosing

45:58

security problems. I kind of grew up in the networking world and we used to, you know, build these large global-scale, Internet-scale networks. And a big part of the task was like, you know, if there was an outage somewhere, to debug that would essentially mean we synthesize tons of data, right, and figure out where we need to focus our efforts, right.

46:22

And security is the same way. You have a large aperture of exposure depending on, you know, the type of company you are, everything from your network to your applications, your devices to your individuals to identity, right?

46:35

There's like multiple layers, right, when you think about security. And it's been a tough challenge, right, in the security industry. We've had, you know, these platforms called SIEMs which try to bring all this together and be able to give you like a unified view where you're able to manage this. But you know, a security ops centre is essentially the nerve centre of how most companies run their security operations. So I would look at the role of LLMs in security in three ways.

47:03

The first I think is from an operations perspective. I think it could be a very useful tool because it has the ability to synthesize large amounts of data and help with that needle in the haystack problem and prioritize. I think it's a great use case. The second one is, you know, LLMs integrated into the security products, right?

47:24

Will essentially, you know, you talk about again, a security agent, existing security software, right, being able to dynamically understand policy, right, dynamically able to respond to, you know, you're adding more users etcetera. I think you're able to sort of build those. We're seeing companies starting to build large language models into their software stack. Just as we talked about earlier, it's a tool right, where where it's useful.

47:52

The third I'd say is, look, you know, LLMs do represent a new, you know, threat aperture, right? Particularly if the models essentially hallucinate or, for example, in that SecOps use case, you know, ignore or miss a critical kind of threat.

48:09

So while you're designing, while we talk about all these agents being deployed in a customer support use case or, you know, a finance analyst use case, if those large language models are not sufficiently grounded, right, and they're not, you know, the data, the training set that they have is not protected appropriately, you risk not just hallucinations, but effectively, you know, a hijack of the entire AI agent.

48:35

So it's early days in that. I think we've had some, you know, interesting conversations with founders who are thinking about this problem in deep ways and building interesting things. But yeah, you know, it's an open problem space at this point. We will learn more and we will evolve the security architecture just as we're evolving with the maturity of the AI agents themselves. OK. And obviously Zscaler is thinking about this because he

49:03

asked that question. Yeah, they're an important player and they have a huge role to play right in in in our overall architecture. We have another question from Arsalan Khan. I'll ask you to answer this really fast because we're just going to run out of time now who says should we create a time and motion AI agent that assesses other agents if they have saved money in an organization, obviously referencing back to the pricing discussion we had

49:31

earlier? If you listen to some of the industry luminaries talk about how we will all have this army of agents, or we will have, you know, human employees and agentic employees, there is a requirement to, one, train all these agents, ground all these agents, right, as well as to evaluate all these agents. So we talked about reflection loops in terms of like the specific sort of output, governing these outputs. So similarly at a higher level of abstraction, which is the

50:01

value, right? Yeah, you know, it's an interesting idea to be able to say you have an agent that's constantly measuring the value of the output of other agents to ensure that they are meeting a particular mark. For example, if you're a customer support agent, right? If it's not deflecting whatever, you know, 100 calls a day or something like that, right? And the metric may vary, then maybe it's not performing

50:25

appropriately. So you could have an agent that's more like an operational function. So I think in this sort of agentic future, there are the worker AI agents and there are these sort of, you know, evaluation agents and you potentially will have manager agents at some point, right? You can think of a future where there are different levels of hierarchy where you are actively evaluating and governing these

50:55

agents. But again, we're early days yet. We're a little bit sort of hypothesizing what this looks like. Gus Beck Dash comes back and he says is it really better to have the agency and the knowledge in one model or have them as separate, loosely integrated systems? It'll come back to the type of problem space that you're addressing, right? You know, in general, we know from just, you know, the things around us that there's no such thing as one unified body of knowledge, right?

51:29

We as human beings, there's so much complexity in our world, in our workplaces, in our sort of consumer-oriented lives, that there's no such thing as like one mega intelligence that essentially does everything for you. So, you know, I think we always solve problems by breaking them down into smaller problems, right? And then using different tools to solve those problems and put

51:56

these things back together. That's sort of the way, you know, human workflow happens, you know, irrespective of whether you're building a chair, right, as a project or you're building a complex set of applications in an enterprise. So I will again hypothesize here and say that I don't believe in this sort of single unified agent. I do think agents will continue to get better. Maybe we will get to this threshold of artificial general intelligence. How will you define it?

52:24

That's another hour's worth of conversation. But I think you will always have this notion of taking a problem, breaking it down, using different tools to solve that problem, and putting it together. And you can think of that same architecture applying in an agentic world as well. OK, and with that, this has been an action-packed hour. Praveen Akiraju, thank you so much for taking time to share your expertise and knowledge with us today. I really, really do appreciate you.

52:53

Thank you. Thank you for having me. And a huge thank you to everybody who watched. Before you go, subscribe to the CXOTalk newsletter so you can join our community and we can tell you about our upcoming shows, and we have great ones, and you can ask your questions during the live show just as today. And with that, a huge thank you to everybody and to Praveen. And I wish everybody a great day and we'll see you again next time. Take care.

Transcript source: Provided by creator in RSS feed