Token Economy and AI Agents: The Final Hurdles in Enterprise Deployment

⁠¶ Comparing IBM Exec and Red Hat Conferences

00:00

It's very much based off of your requirements, your skills, your knowledge, your processes that now need to be defined within your AI stack. And that really is the last mile. And I saw that at both conferences: the realization that there's still a lot of work that needs to be done to get AI to a point where it's actually very fine-tuned, very functional, very efficient.

00:30

And right now it may work, but it may not be very efficient for scaling, it may not be efficient for cost, it may not be efficient for the new token economy that we're seeing. And the last mile is historically the biggest problem to crack, right? And once you solve that problem, Amazon has a physical last mile in terms of how they actually execute on delivery, right? Because you can have warehouses, but everybody lives in a different house, right? So there's a lot of little last miles. It's, it's

01:00

death by a thousand paper cuts, if you will. Proof-of-concept projects are everywhere. Real business value, that's the hard part. This is Data Driven. I'm Frank La Vigne, and with me I have a very special guest, Christopher Newland, who is a colleague of mine at Red Hat. And we're gonna do a deep dive. You travel all the time, I know that. I traveled two weeks back to back, and it had been

01:38

a while since I had to do that. But it is conference season, and so you were at IBM Think last week and you were also at Red Hat Summit this past week, as was I. I have my "I heart Red Hat Atlanta" T-shirt on. So how's it going, Christopher? But yeah, it was nice because one of those two, IBM Think, was actually in my area of Boston, so I was able to attend that locally. Still, still a lot

02:15

though. Like, you know, you're going in at like 7 in the morning to try to beat traffic, and then you're leaving at like 10 o'clock at night. But it's very poor. Yeah, yeah. And those conferences were very, very different, targeting very different audiences. So I felt like I got kind of two perspectives of the AI world and

02:38

what people are concerned about. One from a very executive lens and another one from more the day-to-day users, developers, engineers who are actually implementing the AI. So which is which? I think I know the answer, but yeah. So IBM Think is an executive conference. I think it's normally director level or above. So it's targeting a lot of C-suites, senior directors. I think the lowest you would ever see would be like a senior manager

03:13

of some sort. But for the most part it's a C-suite type of conference. And a lot of the conversation there is more about the business return of AI and what that looks like this year. And then Red Hat Summit is very much about the system administrator, the cluster administrator, the SRE, the developer who's actually utilizing these technologies and actually implementing something

03:43

with it or managing something with it. So two very different lenses on the same challenge within the industry. Yeah, no, it was interesting. And I don't know about you, but the attendees this year had much better AI questions, I think, than at any other Red Hat event I've ever seen before. Right. It seems like people are struggling to implement this in a way that is secure, stable, scalable.

04:16

And I think we also have a much better platform story this year than we had in previous years. Absolutely. So the way I've been framing it to people, it kind of goes into two terms. The first term I've been using with people is that last mile. And that then feeds into this concept that you hear a lot about in business and other industries called the 80/20 rule. I think a lot of people are finding that 80/20, which is 80% of the returns for

04:48

20% of the effort. And then what we find is that it flips: that remaining 20% of returns is now going to be 80% of the effort. And that 20% is what I've defined really as the last mile. And I think the conversations I'm having with people are that they now have the tools and they've had POCs and they're seeing results, and a lot of times even good results. They just don't know: how do I get it to the point where it's actually returning investment, where's the

05:22

ROI? And this is a question that was happening at both conferences,

⁠¶ Challenges in AI implementation

05:26

both from an executive lens and from the, you know, the general day-to-day developers. And this is where I think open source is in a great position, because there are so many open source tools out there that we can work with people on, you know, finalizing that last

05:45

mile. I think what people are most annoyed about though is that there's not a magic button that's going to fix it because it's very much based off of your requirements, your skills, your knowledge, your processes that now need to be defined within your AI stack. And that really is the last mile.

06:07

And I saw that at both conferences: the realization that there's still a lot of work that needs to be done to get AI to a point where it's actually very fine-tuned, very functional, very efficient. Right now it may work, but it may not be very efficient for scaling, it may not be efficient for cost, it may not be efficient for the new token economy that we're seeing. And the last mile is historically the biggest problem to crack. Right.

06:37

And once you solve that problem, Amazon has a physical last mile, right, in terms of how they actually execute on delivery. Right. Because you can have warehouses, but everybody lives in a different house. Right. So there's a lot of little last miles. It's death by a thousand paper cuts, if you will. Absolutely. And we saw the same thing with microservices

⁠¶ Challenges in scaling microservices

06:59

back in the 2010s, where there were a lot of organizations that developed microservices but then had a lot of challenges and had to overcome a lot of that last mile when it came to data domain. And you know, where does your data exist within this microservice architecture? How do you do contracts and handshakes between services? How do you orchestrate these services? How do you scale them? You know, in many ways these are the problems that we saw Kubernetes kind of develop out of.

07:34

And now we're seeing it being embraced for a lot of the same challenges we're seeing with agentic systems and AI, and how do we scale that out efficiently? So I love what you said. It's not a new problem. It's just the same problem we've seen reiterating over 50-plus years of compute history that now just has a different lens to it, the AI lens. But a lot of the same solutions are still the solutions that we had for many of those advancements in technology

08:07

that we saw, you know, over the last few decades. That is interesting because, you know, Kubernetes has solved a lot of the same problems, and it doesn't solve them all. But there's a significant overlap, and I got that sense from the conference that people are finally starting to get it. Like, why OpenShift AI? Well, because OpenShift solves a lot of these problems. You just put AI on top of those solved problems and it

08:33

doesn't fix everything. There's still going to be a lot more room for improvement in terms of how you implement that on your last mile. But it gets you halfway there from the get-go, easily. Yes, absolutely. A lot of the questions at IBM Think, it was actually funny, a lot of them were about IBM Bob. And I know you and I have been kind of talking about this for like the last two weeks, but at IBM Think, IBM Bob was a very serious conversation, with executives wanting to know how they can

09:10

mimic tools like Claude Code, right, but within their enterprise setting. And the biggest thing about IBM Bob that I learned at IBM Think, from both the engineers there and those who are interested in it, is that a big thing here is what they want for institutional knowledge. They want to keep a record of all that institutional knowledge from the prompts and the context and all the things that, you know, are built out of IBM Bob, so that they can keep that information as institutional

09:44

knowledge, really being able to then take that knowledge and kind of reinject it into their broader agentic engineering. And, you know, I don't think IBM Bob is actually really meant to be a clone of Claude Code. I think it's really meant to be a manager of institutional knowledge across many different. Yeah, so we have a special guest, a second special guest, show up. This is Crystal, my little dachshund pup. And I had to pick her up from

10:13

chewing wires, but I was listening. But you're right though, Bob definitely feels different. I don't know how to describe it. I had issues getting authenticated into it, but the folks at the Bob booth, at the IBM booth, did help with that. But it's unfortunately named, honestly, I think, because I think of Microsoft Bob and that was not

10:41

exactly a winning product. Right. But I've been playing around with it and, you know, I had to do kind of the init process on a couple projects, and it was interesting because it suggested how to take those projects and turn them into MCP servers and agents, which the other ones, Codex and Claude, have not. I thought that was interesting, and I didn't prompt it to do that.

11:07

It just basically said on its own, like, you know, you could turn this process into an agent, an MCP server, and things like that. While that was in the back of my mind as I, you know, built these various projects, it was not top of mind. So I thought that was interesting. Yeah, it definitely is not a clone. It's meant to solve a new problem. Yes, I agree, I agree. And I think it really starts feeding into this bigger scope of things like

11:36

spec-driven development. Right. And these other tools

⁠¶ Integrating AI with project management

11:40

of like, how do we get the knowledge out of the project managers, how do we get it out of Jira, and how do we get it into a form that the AI can interpret without losing that knowledge along the way? So as the prompts are coming in, as the context is coming in, it then becomes part of that institutional knowledge. And I think that is ultimately what Bob is trying to achieve. That is very different, I think, from what a lot of the other alternatives out there are.

12:10

My hope is that as this grows, we see more opportunities for it to become more open source. That's probably one area where it's a little different than what we do here at Red Hat, where, I mean, we're not supporting the project, but a project that we're very closely eyeing is OpenCode, for example, which is an open source alternative to Claude Code. It's really interesting to see all these

12:35

different solutions right now. I also like the fact that Bob can be an IDE and mimic more of a Cursor, or it can be a CLI and mimic more of a Claude Code, and obviously with my background I'm more comfortable with the CLI side. That was a big one. And I would say then obviously agents would be the second biggest thing. Just in general, that was the theme last

12:59

year at IBM Think. And that didn't change this year. I think we're just seeing the experimentation of agents now, moving into the solidification of agents in the industry. And I think we heard about agents a little bit at a high level at IBM Think. But then for Summit, everything was about agents. Everything went down to how does this implement to the agent, how does the inference of AI implement to the agent, how does the data implement to the agent? You know, the orchestration layer,

13:36

Kubernetes, all these things. It all had to do with the agent. And that was really interesting to see, how the conversation over the last two years has shifted from all of these individual parts. I think the last time I was on your show, and I know you and I have talked a lot about how, with AI, there's been a lot of these parts, but nothing has kind of unified them. I think what we're seeing with AI agents is going

14:02

to be that unification. The agent will become the unification part of all these different parts of the AI industry, where all these tools now will come together. And we saw a lot of that at Red Hat Summit. You don't think harnesses will ultimately be the container for that, where all these things will live and harnesses will be kind of like the top-level abstraction? This is a really good

⁠¶ Debate on AI model vs. harness

14:24

question because this is the big debate within the AI labs and the AI community: are you invested in harness engineering, or do you think the models themselves will just supersede the harness, that they can be knowledgeable enough to basically function agentically without it? So obviously the OpenAIs and the Claudes of the world, and Anthropic, they're probably a little bit more on the model side, because that would ultimately benefit them.

14:59

Right, whereas I think the IBMs and the Nvidias, and I would say the majority of the industry, are probably a little bit more on the harness side, because that allows a larger ecosystem of third-party tools and something that's a little bit more familiar to people. I don't know. I think over the next year or two it'll definitely be harnesses, because that's where

15:22

we've seen the most advancement. But with things like mixture-of-experts models just continuing to advance in how they can do reasoning and how they can do a lot of agentic work, it could be that we see the model layer chip away at the harness layer. Is this going to be a back and forth? And it really just gets also into how do you inject the context. And this is closely related to the same argument of: is RAG still needed?

15:52

With context size growing so much, why would you need RAG? And I think from an enterprise standpoint, and I think Red Hat is very big on the harness side, because we see the need for different security layers, different integrations into third-party tools, different authorization layers, routing, networking that the model will not be able to manage completely, at least for

16:21

a while. And that's where I think the harness engineering layer will exist because there are all these existing technologies that the agent needs to integrate with and that's all going to happen at that harness layer and then be executed within that runtime layer. Yeah, that's how I see it too. I think the harness layer is really going to be. It may not be a foundational type situation where you build on top of it. I see it more as the mortar between the bricks.

16:53

I agree. Right. Like, and it's not

⁠¶ Discussing model evolution and limitations

16:57

that the mortar is more important than the bricks, but the bricks are kind of a pile of rubble unless you have mortar holding them in place. That's kind of how I see the harness story evolving. But I have a hard time imagining models ever being able to be that far advanced. However, you know, we've gotten further with the LLM architecture than I ever thought we would. Synthetic data and distillation have worked better than I ever thought they would. So take.

17:34

Take my thoughts with that in mind. Right. You know, when I looked at synthetic data and distillation in particular, right, there's a meme where they show, you know, somebody fishing in the water, and then somebody is fishing from that guy's pail, and then somebody is fishing from that guy's pail. Right. And then they show each subsequent fisherman more and more distorted. We've not really seen

18:01

that come about. Right. It's not like you're copying VHS tapes, where each subsequent generation gets worse. I'm sure that if you don't do it carefully, you'll get some weird artifacts. But that has not been the default case, which I think is interesting. It is interesting too, because most of the models that are out right now are actually distillations of the GPT-4 family. Right. Even GPT-5 is still a direct distillation of 4. It was not completely retrained.

18:39

And Anthropic obviously has their first generations and second generations, but we actually haven't seen very much new generation, just because of how expensive it is to create from fresh. And what I'm imagining is that they've tried and they just haven't gotten the results that they wanted. So I think that will be what we see. I don't know. I haven't heard

19:02

if Mythos is. So if people aren't following, the Mythos model from Anthropic is a model that they've withheld because supposedly it's too risky. I don't know if that model is a whole new generation. I would imagine that it probably is. But to your point, most of the models that are out there now, and what we know from the Chinese models, is that they're all just distillations of

19:28

the American models. We have proof now that they've been mass API-hitting GPT and Anthropic and Gemini to create the generation of Chinese models that we have now. So that's something. And they're very performant. Like, those models are extremely good. Very good. I mean, it just shows you, this is not the paradigm of, you know, analog VHS copying. Right. This is more, I guess, in the style of, you know, remixing an old song digitally. Right. You

20:02

don't really get. It's not a well-thought-out analogy, Christopher, but, you know, a lot of techno songs from the early 2000s, you will hear them on, I don't go to clubs anymore, but on my what's-new-and-what's-hot techno playlists on Spotify. Right. I recognize the same backbeat, I recognize the same chorus, right, from songs from like 20, 30 years ago. Right. Like,

20:29

and even sampling in rap music, right. Like, it's a bit more like that, where you do get a completely fresh perspective based on older parts. And that's something that I did not expect. I just assumed that you would start getting really bizarre artifacts after so many generations. But that's not been the case. So, you know, I think it's interesting because this is really uncharted

20:53

territory, right? These are. Yes, they're based on very well known mathematical principles. But like, as these systems get more complex, it's getting harder and harder to predict not just their behavior, but the range of their behaviors. Yep. One second. I'm going to grab something because we'll do a little bit of show and tell as well. Cool, cool, cool. So while you're away, I will maybe I can interview a dachshund. So what is, what do dogs think about AI?

21:21

Everybody and their cousin and their dog has an AI startup now. So what's your AI startup? Oh, a link shortener. Okay, cool. Because, I get it, your short legs. I get it. That's cool. While we're waiting for Christopher to come back, you all know I'm a big fan of Humble Bundle, so. Humble Bundle. Oh, you're back. Cool. Oh, you can finish your thought. Oh, so Humble Bundle. I actually, so I

21:50

worked the booth. I had a talk on day one and I worked the booth on the subsequent days and, you know, a lot of people came by, and other Red Hatters. Actually, I was showing them Humble Bundle. I'm sorry, go ahead. No, I just said that looks really cool. Yeah, yeah. So if you're not familiar with it, humblebundle.com, it started as games, but if you go and you pick store. Not store, I'm sorry, bundle,

22:27

you can pick books and there's comic books there. But there's also a lot of stuff here that is particular around software. Right. So in this example here, this is the books on practice exams for AWS and gen AI, all sorts of interesting stuff here. Security. This is actually a hybrid of like courseware.

⁠¶ Affordable Power BI Courses Bundle

22:54

So they also have software bundles that are, you know, sometimes kind of like image editing tools and things like that. But very often they will have courses for, you know, how to get into OpenClaw and things like that. And if you don't know, Christopher is really into OpenClaw; he helped me get my Claudia kind of up and working. But if you go here, I know a lot of listeners of Data Driven are big into Power BI. These are basically courses on Power BI

23:25

and things like that. And the cool thing is it's $20 for 17 courses and a portion of your cost goes to a charity. So it's really cool. You get a lot of material and, you know, a charity gets funded and things like that. So definitely check it out. They often have AI books or, you know, app development books. A lot of things around game development too, because that's kind of where Humble Bundle started. Nice. That's a great segue too because, speaking of OpenClaw, when I got home from

24:05

Red Hat Summit, this arrived. Oh, nice. So I haven't gotten a Mac Mini in probably like 10-plus years. So when this came I was kind of like. Because I don't know if you remember, the Mac Mini was maybe about the same size, but it was much bigger than this. And I'm actually holding this with like one hand right now. But the reason why I got this is because Red Hat in particular wants to make sure that all of the agents that I'm running for Red Hat are isolated for runtime. So I could use

24:44

my. Let me see if I can pull it over. You have one of those Framework things, right? This is the Framework. Yeah. This is the size of it. So that is actually powering my home lab that has OpenShift in it. I could do that. And that's actually where a lot of our tooling is going. But I also need an agent to have access to my email, to have access to more of my day-to-day tooling, which actually exists more on a desktop. And that's where this guy comes into play.

25:18

It's interesting. Now we're separating

⁠¶ Separating and managing runtimes

25:21

the harness from its runtime, and now I'm dealing with multiple runtimes. I'm going to have runtimes that probably run on the home lab and now we have runtimes that are going to run on this. This one would be: I need it to do something that actually involves some kind of GUI or something that's already on my desktop, for which there's just no API set up for me to use.

25:45

Or I need it to do something, maybe something basic that's really easy to do within the Mac ecosystem, whereas on my home lab, maybe it's an agent that's running AIOps diagnostics on my home lab. Why isn't something up? Why isn't it working correctly? And this is where the whole concept of runtime now has become such a big thing. And I think it will continue to

26:12

become more important this year. Harness is kind of getting the spotlight, but we need to move more into this runtime conversation of okay, now the harness has put the context together, it's put all the knowledge together and the skills. The agent is running the agentic loop with the model. But now where does the output actually run? Does it run on your, your personal computer where it has access to sensitive information and you know, it could do things that it shouldn't or does it run in an

26:41

isolated environment? So this is probably going to act more as like a little server that runs here in my office. This is just for agents, this box. Where does the inference run for your agents? Yeah. Does it run on that? Does it run on your Framework, or does it run in a hypercloud service? So I'm actually

⁠¶ Using semantic routing for requests

27:03

doing a new technique called semantic routing. All my requests go to my home lab first. Within what we would call the control plane for the agent, there's a router that evaluates the information that's coming in and decides, based off of sensitivity and complexity, where the request should go. About 80% of my traffic actually hits the Framework, for a model that's running within vLLM on the Framework device itself, on OpenShift, and then about 20%, which I've deemed kind of high

27:36

reasoning, then gets sent off to our corporate Gemini account that we have within Red Hat. So this way it's also really nice, because when I first started working with agents, I mean, I've been working with agents for years and years, but with our current modern-day idea of what agents look like, back at the beginning of this year, I was running out of tokens. I was getting throttled by Google and there was

28:06

nothing I could do about that because that was part of our corporate account. It wasn't anything where I could go and change the knobs. So moving to this semantic routing approach allowed me to not run into that throttling anymore. So right now I'm running a Qwen, the Qwen 3.6 35B mixture-of-experts model. Nice. And that's running right now and doing all of my local agentic work.
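
A rough sketch of the routing idea described here, for readers who want something concrete. This is not Christopher's actual router: the episode does not say how sensitivity or complexity are scored, so the keyword lists, the threshold, and the backend labels below are all assumptions, and plain substring matching stands in for whatever semantic classifier a real setup would use.

```python
from dataclasses import dataclass

# Hypothetical backend labels. The episode only mentions a local model served
# by vLLM and a hosted Gemini account, not how either one is addressed.
LOCAL = "local-vllm"
REMOTE = "hosted-gemini"

# Assumed keyword lists; crude substring checks stand in for a real
# sensitivity/complexity classifier (for example, an embedding model).
SENSITIVE_TERMS = ("password", "api key", "customer", "salary", "confidential")
COMPLEX_TERMS = ("design", "architecture", "prove", "trade-off", "refactor", "plan")


@dataclass
class Route:
    backend: str
    reason: str


def is_sensitive(prompt: str) -> bool:
    """Return True if any assumed sensitive term appears in the prompt."""
    text = prompt.lower()
    return any(term in text for term in SENSITIVE_TERMS)


def complexity_score(prompt: str) -> int:
    """Count assumed 'high reasoning' cues; very long prompts add one point."""
    text = prompt.lower()
    score = sum(1 for term in COMPLEX_TERMS if term in text)
    if len(text.split()) > 150:
        score += 1
    return score


def route(prompt: str, complexity_threshold: int = 2) -> Route:
    """Pick a backend: sensitive requests stay local, complex ones go remote."""
    if is_sensitive(prompt):
        return Route(LOCAL, "sensitive content stays on the local model")
    if complexity_score(prompt) >= complexity_threshold:
        return Route(REMOTE, "high-reasoning request")
    return Route(LOCAL, "default: low-reasoning request")
```

In this sketch the sensitivity check runs first, so a request that is both sensitive and complex still stays local. That ordering is a design choice of the example, not something stated in the conversation.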

28:31

It's doing most of the low-reasoning tasks, and then all the high-reasoning tasks get sent off to Gemini. So do you ever have it set up where the high-reasoning task will divvy up a bunch of low-reasoning tasks and then send those down to your Qwen? Or is that something in the works? I have experimented some with that. So that gets into some post-inference types of techniques that we've been experimenting with, myself included.

29:02

I haven't gotten that far yet. This is where areas such as speculative decoding or post-inference techniques kind of come into play. Why would speculative decoding come into play here? Yeah, because there could be a speculator that sits at the local model that acts as almost like a guardrail to the larger model, where it can start reasoning about some of the things earlier on and basically acts as a breaker. I got you. And that makes sense.

29:43

That's where speculative decoding would be kind of the next iteration on that, where it's really the management of knowledge and memory and cache at that point. I really haven't gotten into that with my local setup, but that's part of that whole last mile, where memory, I think, will be the last portion of the last mile for everybody. It's going to be memory management, it's going to be cache management. When you say memory, you mean organizational memory, not necessarily the physical memory?

⁠¶ Agent memory and compression basics

30:15

When I'm talking about memory, I'm talking about the memory of the agent itself. For OpenClaw, for example, every time it makes decisions, it keeps a compressed record of what it's done in these JSON files and then it will reference that. Claude Code does something very similar. Every time you hit your token context window maximum, you'll see that it's doing a bunch of compressions and it takes a little thought. That's actually what

30:46

we call a form of memory. If you've actually been following the news, even just today, at Google I/O they just announced a whole new agentic memory platform, a framework that fits right into this. And that's why I think memory is going to be the next iteration on, you know, improving the agentic system. And that's not the KV cache, that's not your physical memory. The agentic memory would be a. Yeah, it's like agentic memory. It's how your agent is

31:22

reconciling what it's doing. It's outside of the context window, but it's not the KV cache. It's something that's like, oh, this is what I've done in the past and this is the context that I just need to keep carrying forward in my conversations. It's maybe not an MD file, it's not like permanent knowledge. It could get flushed. You could just say go ahead and flush your memory

31:53

and that may actually be what you need to do, because maybe there's a lot of nonsense in there or something that's doing something wrong. It's not meant to be long term. Think of it like human short-term memory. Exactly what it is. Interesting. Not everything that we do is long term. So long-term memory in this case would be your MD files, it would be your KV, potentially even some layers of your KV cache, where I

32:18

would actually consider that more like intermediate. But it's really that long lasting context that just keeps getting injected in where this concept of, of memory that we keep hearing about is more of that short term memory of what knowledge do you need to have right now to make the decisions that you need to make based off of the reasoning and the topics that you're working with right now? So a good example would be. I'm sorry, go ahead. No, no,

32:43

no, no, go ahead. No, good example. Like your hotel room number when you go to a conference, right? The chance of your needing to remember that once you check out is very low. Or when you get the two-factor authentication, the six-digit code, right? You only need to remember that for a very short window of time. Yes, exactly. And that's a prime example where

33:07

you could long term forget that information. But in the short term it would be very detrimental if you forget your hotel room, you have to go and ask somebody and that, that takes time. Yeah, it takes time and that's exactly the same narrative. It's not that the agent couldn't get that information, it's just that it's faster for the agent to get that information if it's located in some type of short term memory. And that's where we're seeing so much advancement in, in these

33:32

agentic platforms. Did you want to add anything to that? I know we're coming up on time, so I just. Oh no, I mean, no, I appreciate your time. I see that we're up on time. No, I think there's a lot. I think the one thing I learned this week was it's very easy to think that you're behind everyone else. But you know, we had people come in the booth,

33:58

like, I don't know anything about this, tell me where to get started. And I was like, you know, to hear that in 2026 was both shocking and refreshing. Right. You know, there were people, I'm not going to name any names, but, you know, there are people who are in our division and they've not even installed Claude yet. OpenClaw, I mean. I always get those two confused, even though I know they're very different things. But, you know, who've not installed OpenClaw, like

34:31

on their own? And it's just like, I feel behind because I have OpenClaw, but I don't have it as well set up as you. Right. But I do have it, you know. So it's kind of like, you know, don't be afraid of being behind, because chances are you're probably not. No, no. Part of the reason why I have the dog do that intro now, which of course is, you know, obviously AI generated, part of the joke of that was everybody and their dog is an AI expert now. And there's not

34:59

really any experts. There's probably about half a dozen people worldwide that really are on a whole other level. I mean, the Andrew Ngs of the world, the Geoffrey Hintons of the world. Right. Like those are the people. Yann LeCun for sure. You don't hear much from Yoshua Bengio anymore. But you know, people at that level, right, at that strata, they really are that far ahead.

35:32

And it's always interesting seeing what problems they're trying to solve. What is particularly interesting, I think, is that Yann LeCun is very skeptical of LLMs getting any further along. Yeah. Which I think is interesting. I mean, it's, you know, at this point almost an 8- or 9-year-old concept, LLM transformers. The concept that he created. The concept that he created. Right. So the underlying layers. Yeah, yeah. So like a lot

36:01

of. Go ahead. I was gonna say there's a lot of new

⁠¶ New AI approach and vision

36:04

interviews that he has out in the last couple weeks about, you know, his new approach to AI and how he sees it superseding LLMs. And that'll be interesting too because he's looking at it from a whole new direction. LLMs are just predicting the next pixel, the next piece of text, whereas he's looking at it from the direction of, you know, maybe we built this

36:34

house of cards wrong. We need to just kind of start over and basically start at the basics and build something better from what we've learned. And it'll be very interesting to see what he comes up with out of all this. Yeah, because I'm surprised we've gotten this far this fast with LLMs. I really thought, like, the whole reasoning aspect to LLMs is something I did not see coming. I would not have bet real

37:00

money on them being able to do that. Right. But here we are. Like, they clearly can do some level of reasoning. How much is probably debatable, but, you know, you hear that they're just like the text prediction algorithms on your phone that predict the next word. While technically true, I think that doesn't really tell the whole story. Right. Like, that's like saying that the F-35 fighter is the same thing as a

37:27

paper airplane. Right. Like, they do have to obey the same laws of physics, thrust, lift, gravity, blah, blah, blah. But they are very different animals in that sense. Very much. I agree with that analogy. It's really good. Cool, man. I'd love to have you on the show again. We could talk OpenClaw. Yeah. You've done some crazy cool stuff with that. Definitely. I know some of the agents that you've built that people probably don't want me talking about because I know

37:53

you made a lot of IT security people very nervous. That's true. But the stuff that you've been able to automate has been nothing short of like, oh, my God, that's amazing. And also super useful. What's crazy too, for me, is that I've been so busy that so much of the stuff that I did that people were talking about was like one to two months ago

38:16

and I think this summer. So there was actually a really popular podcast episode out of the AI Daily Brief where he was talking about how everything that's happened over the last six months basically came out of Christmas break. So, like, everyone went home and had a few weeks to just play around with this stuff. I was one of those people. So, like, so much of what I did came out of those,

38:41

like, experimentation phases. And I think I have to repeat that this summer because there are so many new things that we've learned.

⁠¶ Developing a multi-agent system

38:49

Right. But I still haven't built on top of that yet. And I think, so for me right now, so many of my agents are doing very simple tasks. They're doing information gathering, they might be looking at meetings and suggesting that I read certain articles

39:05

correlating to something I'm about to talk about. But I want to go to the next level, where I get into a multi-agent system where I have a chief of staff agent who's got one agent that's doing programming demos, and then I have another one that's doing general administrative assistant work, or another one that's front facing, you know, a model that's on our Slack that people can just ask questions to based off of the institutional knowledge that I have of

39:35

our company and our industry. So that's the next phase, and that's where the memory stuff has to come into play and the multi-agent kind of orchestration, and all these things are things that are being worked on now. So there's not a clear winner or a clear understanding of what that looks like right now. But we're all kind of playing around with it. So I think that's kind of the next phase. And yeah, I look forward to coming

39:55

back. And I think that's probably what part two of this conversation will be. Absolutely. What does that look like? What are these tools? How do we kind of build on top of this thing called OpenClaw or Hermes or all these other ones that are out these days? Yeah, that'd be awesome. And even if we just do a deep dive on, you know, what's what. Because I know you mentioned a couple of things that maybe most of our listeners don't fully grok, right. Because

40:22

we have a lot of data engineers here too. Right. And the other thing too that really came out was the questions people would ask me, because we have something that Microsoft folks may know as TFAs, or technical focus areas; call them pillars. So you're the agentic lead, I believe, and I'm the connecting models to data lead. Right.
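
The chief-of-staff arrangement described in this section can be pictured with a minimal sketch. It assumes a simple keyword router standing in for whatever model or framework would make the routing decision in practice; it is not OpenClaw's or Hermes's API, and every name in it (`ChiefOfStaff`, `register`, `route`, `handle`, the specialist names) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# A specialist agent is modeled as a plain function from request text to reply.
Handler = Callable[[str], str]


@dataclass
class ChiefOfStaff:
    """Hypothetical orchestrator that hands each request to one specialist."""

    fallback: str  # name of the specialist used when no keyword matches
    specialists: dict[str, Handler] = field(default_factory=dict)
    keywords: dict[str, tuple[str, ...]] = field(default_factory=dict)
    # Shared memory: a log of (specialist, request) pairs every agent could read.
    memory: list[tuple[str, str]] = field(default_factory=list)

    def register(self, name: str, handler: Handler, keywords: tuple[str, ...]) -> None:
        self.specialists[name] = handler
        self.keywords[name] = keywords

    def route(self, request: str) -> str:
        """Pick a specialist by keyword match (a stand-in for a model's decision)."""
        lowered = request.lower()
        for name, words in self.keywords.items():
            if any(word in lowered for word in words):
                return name
        return self.fallback

    def handle(self, request: str) -> str:
        name = self.route(request)
        reply = self.specialists[name](request)  # KeyError if fallback is unregistered
        self.memory.append((name, request))
        return reply


chief = ChiefOfStaff(fallback="admin")
chief.register("demos", lambda r: f"[demos] drafting code for: {r}", ("demo", "code"))
chief.register("admin", lambda r: f"[admin] handling: {r}", ("schedule", "meeting"))
chief.register("slack", lambda r: f"[slack] answering: {r}", ("question", "who", "what"))

print(chief.handle("Build a demo for Friday"))
print(chief.handle("Schedule a meeting with the team"))
```

The part the conversation flags as unsolved is the `memory` list and the routing step: a real system needs durable, shared memory and a far better routing policy than keyword matching.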

⁠¶ Importance of data chunking

40:40

So the RAG and that sort of thing. And, you know, a lot of the conversations I had were about how data engineering is more important now in AI systems than it was in the past, because I don't know exactly how agentic systems would fail, but when they fail, they probably fail very spectacularly.

41:04

But I know with RAG systems, right, you know, if your data chunking strategy and your data indexing strategy are not, I wouldn't say perfect, because you'll never really get there, but appropriate to the source documents that you're dealing with, it's gonna fail in a way that is subtle and is only gonna amplify and get worse down the road. Right. So you

41:26

really have to think through a lot of these things. Right. The one sentence I said most of all was, you know, chunking is an architectural decision. Yes. It's an important one. Treat it with that importance, as opposed to just whatever, you know, paragraph by paragraph or blah, blah, blah. So that was another consistent theme. But I will say that the questions that I get are far more evolved than any I've gotten at other conferences in a while. I agree.
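
To make "chunking is an architectural decision" concrete, here is a minimal sketch that contrasts a naive fixed-size splitter with a paragraph-aware one. It assumes plain text with blank-line paragraph breaks; the names (`ChunkingConfig`, `fixed_size_chunks`, `paragraph_chunks`) and the default sizes are illustrative assumptions, not taken from any particular RAG library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingConfig:
    """Hypothetical knobs; names and defaults are illustrative only."""

    max_chars: int = 200
    overlap_paragraphs: int = 1


def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive baseline: cut every `size` characters, ignoring document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def paragraph_chunks(text: str, cfg: ChunkingConfig) -> list[str]:
    """Pack whole paragraphs into chunks of roughly cfg.max_chars.

    The last cfg.overlap_paragraphs paragraphs of a chunk are repeated at the
    start of the next one so context carries across the boundary. Paragraphs
    are never split, so a chunk can exceed max_chars when a paragraph is long.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        candidate = current + [para]
        if current and len("\n\n".join(candidate)) > cfg.max_chars:
            chunks.append("\n\n".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            carried = current[-cfg.overlap_paragraphs:] if cfg.overlap_paragraphs else []
            current = carried + [para]
        else:
            current = candidate
    if current:
        chunks.append("\n\n".join(current))
    return chunks


doc = (
    "Refund policy.\n\nRefunds are issued within 30 days.\n\n"
    "Shipping policy.\n\nOrders ship in 2 days."
)
for chunk in fixed_size_chunks(doc, 20):
    print(repr(chunk))  # cuts land mid-sentence
for chunk in paragraph_chunks(doc, ChunkingConfig(max_chars=60, overlap_paragraphs=0)):
    print(repr(chunk))  # each chunk keeps a heading with its body
```

The fixed-size splitter separates headings from the sentences they govern, which is the kind of subtle retrieval failure described above; which strategy is appropriate depends on the source documents.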

41:56

Especially this year. It's just a step up from where we were. So. Yeah. Cool. This is great. Thank you for having me on. No problem. We'd love to have you back. And since the recording's for the podcast, we'll let the music play.

Transcript source: Provided by creator in RSS feed