From AI Hype to Real-World Results! Getting AI into Production with Sohrab Hosseini

00:00

Hi everyone, my name is Patrick Akeel and joining me today is Sohrab Hosseini, co-founder over at ORC, where they enable AI product teams to put AI solutions in production. And that's exactly what we talked about today: how to be in control in production with AI solutions with regards to observability, the specifics of Gen AI with regards to models and rate limits, and much, much more. Fascinating conversation in my opinion.

00:24

So enjoy. From your perspective, in this world of AI that we now live in, things escalated I think a year or two ago, specifically with regards to Gen AI and ChatGPT kind of coming out to the public. We now have a lot of fluff, hype, and potentially also really good things running in production. What for you is really valuable in this field of AI and AI product development, and what is still kind of unexplored or more

00:52

fluff and hype territory? So what I see a lot of times is actually, I will try to answer it in a different way, then tell me if I didn't answer your question right, happy then to change. But it's a lot of laziness as well, right? Whenever you're reading up on stuff, or new frameworks come out, new libraries, it always turns into: this is how, with our framework, you build a lead generator, or how you build a personalized email

01:22

outreach. It's always those same two or three examples, which is actually intellectual laziness, right? And what I'm always trying to sift through for is the very product-specific AI features that teams build and ship, that are so specific to their product, and not this boilerplate stuff like, oh yeah, a sales outreach agent or, you know, an email outreach agent. And everyone has that as an

01:47

example. So what I'm always looking for in product teams is: what is the strategic differentiator, where can we infuse AI into our product so that it makes it truly different, right? Because of the type of data we have, or the type of users we have, or the type of workflow we support, what is the creative rethinking we can do around that to infuse that with AI?

02:16

That gets me going and excited. And whenever I read, oh, this is how we do ICP outreach or, you know, lead generation outreach, yeah, then that's kind of fluff for me. And another one is, you know, all these people on LinkedIn writing stuff, and then you open their profile and you see they're now an AI strategist. But six months ago, they had still been, like, 12 years a project manager at some boring bank, and now they're an AI

02:45

strategist. That's also like, yeah, is someone actually building and shipping, or are they just regurgitating stuff that they found, or have they just had ChatGPT write some blogs for them to post on LinkedIn to become an influencer? Yeah. So it's either do you have the track record, or are you doing something really creative with AI, right? Molding it in your favour instead of just the standard, right, personalized email outreach. Because my inbox is full of them, right?

03:16

Like my spam box on any given day has around 150 cold emails with, yeah, this AI-generated crap that no one reads. It's a sad reality that we got there, right? And I understand, from, let's say, the perspective of someone that sees this out on the market and really wants to actually contribute: how do you differentiate yourself? That's probably a question they've asked themselves, but the answer is the same, right?

03:42

Yeah. Act like you're an expert, or start creating content, or start playing around with it and actually see how you can deliver value. The easiest part is to act the part and not actually do the thing. Do the work, put in the effort to try and understand and deliver value. Figure out use cases in your personal productivity, and then see what use cases can be solved in the outside world. So maybe people take the easy road. And maybe just on what's fluff and what's not fluff, right?

04:08

So are people actually building and shipping or just talking about it, right? It's easy to filter those out. But another important one is whoever thinks AI is this magic pill you take and all of a sudden your life changes, right? Whoever makes that promise in this current day and age, right? I'm not saying what AI will be able to do five years down the road, but right now it's never a magic pill.

04:30

Like you just turn it on and you start getting leads and sales and, you know, everything blows up. So you see that also often, right, where it's being sold as: just take this magic pill and your whole life will be better after this. Just turn on our AI sales agent and your whole funnel will be filled up with qualified leads. Yeah. So that's also important to see as fluff, I feel like. Same as people saying, take this pill and you'll lose weight.

04:56

Yeah, it's never the case, right? It's a grind to really do something. I mean, it's always been a thing. The example that you give, take this pill and you'll lose weight, that's what people want. Yeah. And that now is kind of out there with Ozempic. But we won't go in there. That's what people are looking for. And when it's there, they will take it. That's the easiest way. Well, even with Ozempic, you need to change your lifestyle.

05:16

Yeah, you need to work out, you need to do strength training. So it's not like, oh, I'll take a shot and my life changes, right? So yeah, it's really experimenting, testing, making mistakes, right? Getting your organization going with AI is almost like building a startup. You have an idea, you test it, it works or it doesn't work, you roll it back, you change it, right? So yeah, it's never a magic pill

05:46

for anyone, I think. Have you seen any common use cases that have been put into production with regards to AI? Stuff that actually works, even though the data might be specific to an industry or an organization, where using the tools in this way is, like, a solid use case? Yeah, already, right. A couple have been proven, which I also kind of believe in. So one is of course code, right, coding. At this moment, all our engineers have to be AI-first. So we don't hire juniors

06:14

anymore. It's either mature mediors or definitely seniors, because at any given time they're orchestrating, you know, four or five coding agents that actually output high-quality work. And our platform is really, really complex. So it's not like, oh, just, you know, create a landing page, but it still does a really good job and it improves our efficiency, right? So coding is already clearly one that works. Customer support, right? That's clearly one that works.

06:44

It's also why those are always the examples that are being used, because those are more mature and have proven that they actually work. Yeah. So in big buckets, those are the two buckets that already prove value if implemented correctly. Right. So with coding, your people need to know what they're doing. The orchestrators of the agents need to know the architecture and the vision of the product so they can assess the quality of the output, right.

07:11

So again, it's not a magic pill. I cannot, as a non-technical founder, turn on five coding agents and then develop a business-critical, scalable platform. So the senior people orchestrating them need to know what they're doing. It's just that they now all of a sudden have these tireless team members that can output the work, and they are the quality assurance for the architecture and scalability of the platform. Same with customer support.

07:39

It's not just, oh, we'll upload all our FAQs into a chatbot and then it's solved, right? Because you will have knowledge gaps. The data is outdated. So you do need to change your processes and change some roles of people and make other people responsible for handling the knowledge gaps. But how do you have visibility on the knowledge gaps of your AI chatbot?

07:58

And do you give the AI support bot access to different APIs so it can do a refund or, you know, trigger some specific processes? Yeah, all that you need to solve, right. Again, there's no magic pill that then does that for you. Yeah, I have seen people, especially now that MCP came out as kind of this new integration block, that are afraid of doing that, right, giving a model access to refunding. And I get it from an information-gathering standpoint.

08:29

If I'm having a production issue and I have a multitude of systems, I would love to have one interface that goes to all the systems and it gives me a diagnosis. Yeah. And I can go back and forth with regards to one interface that just gets the information from all systems, but I'm in the driver's seat. Yeah. If I were to then be like, yeah, I'm a customer support engineer, and I can see issues with regards to an order and then an automatic refund kicks in.

08:51

And I'm like, maybe that's not the right solution, but something else decided to do that. That's where people are a bit more fearful. No, and it's all about control, right? Are you in control? In control of the costs, of the performance, of the data, of the actions that the AI is doing. So how do you give someone a sense of control? Right. It's the same way that the dashboard in a car has been invented over time, because for you as a driver, it is proven:

09:19

OK, I need to understand: how fast am I going? How's my engine doing? Do I have my lights on or not? Right. There's some information you need. And that's why a dashboard has evolved into what it is. Yeah. We're now in this change where people start putting AI in production and then they realize, oh, I don't have control, right? Or this use case sounds cool, right, giving refunds, but I need to be in control. We need to have auditability.

09:45

There needs to be maybe a human in the loop. But if the human is in the loop, the human needs to have the information to say yes or no, right? The agent can't just say, hey, can I make a refund? Because seven questions will come after that. Like, for what? What's the reasoning, right? So it's: here's my information, here's why I think this person needs a refund. What do you think? Right.
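
As a rough illustration of the point above, that an agent asking for approval should bring its reasoning and evidence with it, here is a minimal Python sketch. The `ApprovalRequest` shape, its field names, and both helper functions are hypothetical and are not taken from any platform discussed in this conversation.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRequest:
    """What an agent hands to a human reviewer before acting (hypothetical shape)."""
    action: str                 # e.g. "refund"
    amount_eur: float           # size of the proposed action
    reasoning: str              # why the agent thinks the action is warranted
    evidence: list[str] = field(default_factory=list)  # facts the reviewer can check


def is_reviewable(request: ApprovalRequest) -> bool:
    """Only forward requests that carry both reasoning and at least one piece of evidence."""
    return bool(request.reasoning.strip()) and len(request.evidence) > 0


def render_for_reviewer(request: ApprovalRequest) -> str:
    """Format the request so the reviewer can answer yes or no without follow-up questions."""
    lines = [
        f"Proposed action: {request.action} ({request.amount_eur:.2f} EUR)",
        f"Reasoning: {request.reasoning}",
        "Evidence:",
    ]
    lines += [f"  - {item}" for item in request.evidence]
    return "\n".join(lines)
```

A bare "can I make a refund?" would fail `is_reviewable`, while a request with reasoning and evidence gives the human what they need to decide in one pass.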

10:04

So an entirely new, what we call agent control tower, but it's like a human-agent interface, now needs to be invented, right? What does that look like? Think of an air traffic control tower at an airport, right? How does the air traffic controller feel empowered, in control, sure that they're not going to make mistakes and create accidents? Well, organizations are now going to need these agent control towers where an individual can orchestrate a

10:35

swarm of agents. And it's different per use case, right? But the agents are going to get stuck. They're going to throw exceptions, they're going to ask for approvals. But how do you feed that to the human so that, first of all, they're not flooded with information? Because they cannot read 20,000 logs per second. And yeah, so you have the just-in-time and management by

10:56

exception principles, right. Everything has already been invented in management and business administration. It's like, what's the just-in-time flow you need? What are the management-by-exception flows you need so an orchestrator can do their work? Yeah, that's the phase we're in, where the next stage is starting to become uncomfortable for people, forcing us to invent new concepts. Yeah, I want to get into, like, the maturity of organizations.

11:27

What maturity you want to have as an organization before you go there. But before we do, let's touch a little bit on what you're doing with ORC, because I think we already scratched the surface there. Oh sure. So at ORC we provide an AI engineering platform. So it's a SaaS that we build and maintain for engineering teams and product teams who have a vision, right, who want to infuse either existing or new products with generative AI in a scalable way.

11:58

So they use our platform and the middleware in there to develop hypotheses, test and experiment with their hypotheses, and evaluate how their use cases are going to perform. Next, they put it in production, and in production they do granular rollouts, canary releases, A/B testing, so that whatever AI feature you have shipped in production acts and works the way you intended. Again, control. And then observability, monitoring and evaluations are needed there.
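
To make "granular rollouts" and "canary releases" concrete, here is a minimal Python sketch of one common way to do percentage-based rollouts: hash the user into a stable bucket and compare it to the rollout percentage. The function names and the "canary"/"stable" labels are assumptions for illustration, not how the platform discussed here implements it.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_variant(user_id: str, feature: str, canary_percent: int) -> str:
    """Return "canary" for roughly canary_percent of users, otherwise "stable"."""
    if rollout_bucket(user_id, feature) < canary_percent:
        return "canary"
    return "stable"
```

Because the bucket depends only on the feature name and user ID, the same user keeps seeing the same variant while the percentage is raised step by step, for example from 5 to 25 to 100.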

12:29

And then finally the human in the loop, right? So the product managers, the domain experts, right, they're not all Python engineers. How can they be involved in this whole process to give feedback, to make corrections, to identify hallucinations? Now teams are patching this together, either with some random open source libraries or they build their own solutions. Well, we bring this as a fully managed, well-integrated

12:58

platform. So the teams can just focus on building their features instead of maintaining this whole enablement layer underneath. Yeah. Because one thing teams never have enough of is resources, right? So whoever they can get on board, the right AI engineers they can find, you want them to be working on your features and not maintaining, let's say, the pipelines under the ground. So that's where we then come in. Yeah.

13:23

And then finally, because of our AI gateway, we're integrated with all the hyperscalers, all the cloud providers, all the model providers, and we unify that into a single API. So teams, first of all, are not locked into a single provider and they can actually operate multi-cloud out of the box. And now with all the geopolitical stuff and data residency concerns, they can even decide which providers to use for which entity, in which region, et cetera.

13:52

Again, control, control, control, that's what they need. Yeah. I think that's mainly, like, if I look at the software development life cycle, we have strong conventions, and those conventions can still differ between organizations, but the tooling is there to kind of adhere to those conventions and put them in place, so we don't reinvent the wheel there necessarily. And from what I hear you saying, you're trying to be one of the tools in the tool belt

14:14

there. And a lot of people are trying to figure out, do I build my own or do I indeed grab something off the shelf, which I think is valid. With Gen AI in production, we've had, I think, the most out-of-control situations in production, right? Because I've had many demos where people tell me, this is our product, we do X, Y and Z, and then something happens and they're like, well, it's still generative AI. So there is a level of

14:34

hallucination. That's not what I want to hear, especially if I'm buying a product. I want guarantees. I do want that level of control. I don't want someone to say, well, it's a thing and it just hallucinates. So I fully understand that people are looking for that. Yeah, of course. You know, you want to make the margin of error as small as possible. Yeah, but if you had humans doing that job, you would not be shocked if one of the employees

14:56

would make a mistake, right? You would not hold them accountable for 100% perfection. Yeah, it depends on which field, but in most fields, you know, you're not going to fire your people if they make, you know, one error in 1,000, right? But you do have that kind of tendency when it comes to AI, or self-driving cars as well, right? Like, how many people are getting killed by normal cars versus self-driving cars? But still, right, one is already too many.

15:22

So that's kind of also a mindset shift that needs to happen. But indeed, in software delivery, as you mentioned, a lot of the concepts are, you know, decided on, right? You need code control, you need pipelines, you need CI/CD, you need regression testing. No one questions that. And even then, some of those you either buy or build. But now you're starting to add these little black boxes of unpredictability to your software.

15:48

Yeah, right. That even get updated behind the scenes without you knowing it, right? Like OpenAI, they can just update their model behind an API and you would not even know it. And all of a sudden your use case starts acting differently. So the life cycle management, which is not really being discussed now, becomes a problem. Everyone is already happy if they get a PoC out the door that they can demo to management and

16:12

get some applause. But no one has thought about, OK, well, how do we put it in production and how do we then get into the continuous delivery and improvement flow, right? You would also know this, with your background in product management, et cetera: getting it out there the first time, that's when the work actually starts, because then you start getting feedback and learnings and edge cases and

16:32

weird behaviors. So how do you set yourself up for the cycles of learning, getting insights, changing your hypotheses, testing them, and putting it back in production? And now what you see with teams is that they get totally stuck, right? Everyone starts tumbling over each other and tripping, and Excel sheets with color-coded corrections are getting emailed back and forth, the engineers going crazy because the process is not, you know, solidified yet.

16:58

So that's also where we try to come in, giving you that end-to-end process so you can do this continuous delivery, almost like CI/CD for Gen AI, right? Yeah. Is that then what a resilient, production-ready solution looks like with regards to an in-control state? So I deploy something and I have insights with regards to, is it guardrails, is it boundaries, is it performance? Yeah, everything with regards to observability, definitely. I mean, again, relating back to

17:26

normal software, right? You will check how fast it is working, what's the latency? Why is this specific call taking so long? Oh, I don't know, you know, we don't have an index in the database for this one. That's why it's so slow. We need to improve it. You will have the same analogous challenges with Gen AI, right? Like, why is this model getting slow over time? Why are we hitting our rate limits? Do we need to do orchestration, retries,

17:51

fallbacks? The models are not up 99.999% of the time. So how do you build in resiliency? Cost, compared to normal traditional APIs, is very different with Gen AI, right? You can literally get yourself into negative margins by building the wrong AI feature. We were speaking to a prospect who does like a billion in revenue a year, and they built a PoC of a feature that does some really cool

18:22

recommendations. But he was like, if we would turn this on in production and half of our users would actually use it, yeah, then we would spend 500 million a year on AI tokens, which is half of our revenue. Yeah, just on that one feature, which is not even a killer feature that will change your life as either a user or a business. It was a recommendation feature, for half of their revenue, right? So cost is, for example, something that's really important. And who is responsible

18:53

for cost? In a lot of mature organizations, the product manager, right? They own the P&L of their product or their features. So they need to know, hey, how much is it costing me to ship this feature versus what is it getting us? So again, you need control along multiple dimensions. Interesting. Yeah, I feel like a lot of people are seeing where they can deliver value indeed. But then if it's worth it, that's a whole other question,

19:20

right? I've seen many proofs of concept with regards to, I mean, the environment that I used to be in, which is a bank, because it's a really cool technology and we don't want to be behind. So we're going to do proofs of concept to see where we can deliver value. And if it's a yes, then usually we should look into scaling things. But then through that you can also congest the system, because only a few things can have support from the right engineers and actually make it to

19:42

production. And if you just go wide with regards to use cases, you don't actually see it. OK, everything has value, but what is the cost and what is the business case? At the end of the day, I feel like there is a lot of value in reducing manual labour where you just have documents, PDFs, and people are going from that to filling it in in the systems. I've seen hundreds of pages manually analysed to then be put into a system. I feel like there is a very

20:08

big use case. The sooner you can have those data points kind of automatically pre-filled, with a human in the loop for the assessment, a four-eyes principle if you will, where the first two eyes are AI, the more you can increase productivity in that way, and that is more of a cost reduction exercise. But for me, that's a very tangible use case.

20:29

Yes, and at the same time, with my strategy consultant hat on, I will immediately be like, but was that process needed in the first place, right? So sometimes it's even just blank canvas, blank slate, rethinking the whole process: why, how did we end up in this situation with this workflow? And then see what we can do with AI instead of just trying to patch it with AI. Because, yeah, you're just trying to fix a shitty process to begin with, right?

20:56

Like, we're now going through some funding processes at the moment, and I'm being confronted with, and I'm super allergic to, all these inefficient processes, right? Like, I literally have to drive to one notary to pick up a physical book, yeah, of the shareholders, and then drive it to a different notary, and they need that booklet. We need to go everywhere with our IDs, my co-founder and me, to identify ourselves in person even though they have everything

21:29

about us already. There's an endless amount of KYC forms I need to fill in, even though the Chamber of Commerce has all that information, like I'm registered as, you know, the director. But then I'd still need to manually fill in some random PDF with all of that. So it's like, OK, I can automate that with AI, but isn't it a stupid process to begin with? Like, you as a notary, why don't you just pull that from the Chamber of Commerce?

21:53

Why do you want me to download it from the Chamber of Commerce and print it out and then fill it in by hand and then post it to you? Right. It's like the whole process sucks to begin with. Yeah, I mean, if people come to ORC specifically, or to you, with a use case, saying, we want to use ORC in production because we think we can be in control, do you then also challenge those use cases? Because I think, if I were your customer, for me that'd be incredibly valuable,

22:18

right? AI, I see this, my competitors are doing this. I have the feeling I'm behind. I want to do something resilient in production. I think you're a great fit. This is our use case. And if you then say, well, actually, do you need to have that in the first place, that would be incredibly valuable, yeah. We do, although we have great consultancy and strategy partners who already do that in the more complex projects. So they go and do that.

22:45

So we come in with the technology afterwards. The second thing is, teams that are only thinking about starting with Gen AI are not our buyers, right? Because then we're trying to sell them a Formula One car when they don't even know if they need a bike or a scooter or a car. So the teams that knock on our doors and that, you know, we start working with have a lot of times done multiple projects, have had some bloody noses, have had some

23:14

hypotheses killed already. So it's also a different dynamic versus, we are thinking about AI, what about you guys? That's not a fit. No, because they don't even know what they're looking at. When we then show the platform, they're like, the buttons work and the guys are nice, but we don't know what to do with this piece of technology, right? Is that then also kind of the maturity progress an

23:39

organization goes through? They see AI, they start experimenting small, and then they see kind of what it means with regards to use cases before they put it into production in a controlled manner. Yeah, and they see the projects getting slowed down when going into production, or not making production at all, or in production getting some bloody noses from things going wrong. And then when they see our platform, they're like, oh, shit, we needed this six months ago.

24:03

But if we had shown them this six months before, they would not have recognized the value of it, right? So they understand the concepts that we then explain: oh yeah, we need the experimentation, we need observability. But, you know, once you've had a couple of those bloody noses, then you're like, oh, I really need this. So that's part of the maturity curve. And, yeah, then we always are like, hey, let's check back in

24:24

six months, or let us help you. But yeah, you're not going to get the value out of the platform if you're just starting and thinking about AI. I feel like it's interesting that organizations need to experiment, fail and learn in this kind of manner, because it's so accessible and people can try things out and can kind of see in their own domain what would be valuable and what would not be valuable, and actually start doing that.

24:49

It's very different from, let's say, data-driven solutions. With machine learning, you needed to have a full infrastructure layer and histories of data beforehand. And now you just generate stuff based on what you already have. Or you tweak your content, if you have a marketing tool and you personalize it based on demographic data that

25:06

you have. I feel like people can get up and running faster, and then this cycle of iteration, of failing, getting feedback and improving, they can do that on their own nowadays. Yeah, because it's a lower hurdle of entry, right? With generative AI you just use natural language. Everyone can use natural language to communicate. Well, most people can, right?

25:27

Yeah. You didn't have that in, you know, traditional ML, because there you need, like, researchers and PhDs and data scientists and labeled data to get a feature out. Here, it's like, yeah, everyone uses ChatGPT. And, yeah, if you just get the concepts, a lot of people can already start working with generative AI, which also brings its own challenges, because it's so simple. People underestimate what you need for business-critical features in production, right?

25:57

Someone that says, oh, I use ChatGPT, so we can also put this in production. No, that's not how it works, right? That's why it also requires an entirely different type of stack than traditional ML needs. I mean, that, I think, for me explains why there are so many... Like, the opinions are not very coherent.

26:16

It goes all over the place. And I feel like what we see out there, especially on LinkedIn, but definitely also on Twitter with regards to different types of thinking and saying, OK, non engineers can definitely build products. And like it's, it's so out there and people are either trying to sell something, they might be bots. I'm naive as hell and I just see this happening.

26:34

I'm like, actually, I don't think it's ever been this messy with regards to the different signals that one individual can pick up and that then help form their own opinion. And every cycle you do have these snake oil people, right, trying to sell snake oil. So I mean, crypto time, right? Blockchain is legit technology, but then you had all this crypto crap being built or hustled on top of it.

27:01

Yeah, this is kind of the new era again, right, where you have these people leeching on the people who don't know and just trying to, yeah, scam them in a way. Yeah, if someone is... Buy my e-book. Yeah, it'll have a refund, but you don't get a refund. If someone is listening and they hear this and they are maybe familiar with AI, what would you recommend to them to actually start getting a fundamental knowledge with regards to learning what it is, what is out there for their own

27:30

organization? Start experimenting? How would they start? Well, the most important one, of course: there are so many resources out there. I mean, you can literally use ChatGPT to get your knowledge up to speed, right? But you need to really understand in which fields this technology can actually help and in which ones it cannot, right? Because whatever question you ask it, it will answer yes. It doesn't mean, yeah, right, it doesn't mean it's correct. So people think, well, we'll

28:02

have AI fix it, right? So yeah, let's plug in AI. So first of all, it's used as this blanket statement of, well, AI will fix it, which is not the case. You really need to be in the engine room, you know, bloody noses, dirty hands, to really get a feature properly working. But also understand the boundaries of, OK, you know, processing natural language, classifying text, all that stuff, it's great for that.

28:26

But, you know, you cannot just have it run your accounting in Excel with just a prompt. So it's about really understanding how these models are built. So what are their strengths, but also what are their weaknesses? So let's double down on the strengths, you know, and map them to what we need as an organization. But also let's not fool ourselves by having a use case that is actually a weakness of an LLM, right? Thinking that, again, it's a magic pill that will solve our

28:58

accounting processes. You have AI start-ups building stuff for accounting, but then they're building very custom solutions for that. But you cannot just, you know, send an Excel sheet with all your bank statements to a model and then have it, you know, churn out your whole accounting and your reports and balance sheets properly. Gotcha. Yeah. So the boundaries of the capabilities are important. Yeah, figuring those out, I mean, because it's so accessible, you

29:29

can do a lot of experimentation. And I do think with a sceptical mind you can find those boundaries, even though the Internet sometimes makes it seem that everything is possible and it's just this magic. It definitely feels magical, I must say. I've read a little bit about what goes on under the hood, but from a user perspective, it's like, yeah, it feels accurate, it feels natural. I've used voice assistants.

29:53

They are incredible. Like, it just keeps getting better and better, but critical thinking will always be necessary. And then you have all these non-functional things as well, right, like rate limits and token usage and all that stuff. These are all non-functionals. But we now have clients who have very successful features, but now they're getting crushed by all these rate limits. Yeah, that's how it currently is, right? So that's a limitation of the

30:17

technology at this moment. You cannot, you know, service a million of your users with one API key of one provider. You will hit rate limits, right? It's the same, you know, I always say, well, I didn't invent the saying, but right, history doesn't repeat, but it rhymes. During the whole blockchain hype, it was like, yeah, this can be our centralized database, or we're going to, in our company,

30:39

use blockchain to store info. Well, and then you had the problem of the speed of these databases, that you cannot use blockchain as a database for software because it's just too slow, right? And why would we? We have a database in our company, right? Why would we put up a blockchain? So technology being misused for the wrong reasons is inherent, and it also allows us as mankind to find

31:05

new use cases for technology. But you also should not be afraid to write stuff off because it didn't yield the results you wanted. Linking back to the experimentation mindset you need.

31:18

Yeah. One of the operational questions I had was indeed with regards to rate limits, because I've been in organizations and I've put Gen AI solutions into production, but rate limits weren't an issue due to our size: an internal application not reaching a certain level of scale, not being an app that goes to consumers. I feel like if you reach a certain scale, rate limits are definitely going to be an issue. Do you then go and switch to a different model, or how do you accommodate for rate limits?

31:44

Do you really just have to swipe the credit card and pay more? No. So yes, even that is sometimes not possible, right? So they're making reserved deals with the providers, but then you really need to have big commitments. So as a small startup, you're not going to, like, commit to 200K a year to Google to get dedicated space. Also because you don't know if you want to be on that model in two years; maybe a better one, cheaper one, faster one will come out.

32:13

Yeah, right. So that is one of the reasons. So what we then do is this fallback orchestration, right? So in Orq, for example, people can have, yeah, up to five fallbacks now, but it can be 10 if you want. So you're load balancing your rate limits across multiple data centers and multiple vendors even, right? Because a lot of the popular models are provided by multiple providers, right? So Anthropic models, you have Anthropic, but they also run on AWS and on GCP.

32:45

So if you set those up right, like if you use Claude 4 Opus and you have, you know, all three providers, you kind of have an aggregate of the rate limits. It's the same model, so you don't have a quality difference, right? So those are the different ways you can orchestrate around it. Or you can even decide, like, as a fourth fallback, we will fall back to OpenAI then, which is maybe a different quality or not as good, but at

33:14

least we're not down, right? So as a product team, you're constantly making these trade-offs. How good does it need to be? How fast does it need to be? How cheap does it need to be? And per use case, right? It's not even per client, but per use case. You make different trade-offs. If you're building a customer-facing chatbot, yeah, you cannot wait 25 seconds for every answer. So then speed is of importance.

33:39

But if you're running some background process, you don't care if it's three seconds or 30 seconds, as long as it's processed properly. So then you're optimizing for correctness, right? And as a product team, you're always making trade-offs. You do that also in software, right? Like, do we want it to be fast? Do we want it to be reliable? Do we need to have multiple availability zones? Yeah, you're making all these trade-offs constantly. Yeah, I like the saying.

34:03

There are no good or bad decisions, there are only trade-offs. Yeah, and that's basically it. But I love this solution of there being multiple cloud providers hosting these same models and then having kind of availability across that. Yeah, to kind of up your rate limit. I feel like it's genius, to be honest. Yeah, well, for example with Azure then, right, you can literally deploy a model in different regions. Yeah. So you actually have the regions as each other's fallbacks.
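The fallback chain described in this part of the conversation can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the guest's actual product: `RateLimitError`, `Deployment`, `complete_with_fallbacks`, and the deployment labels are all hypothetical names, and the stand-in client functions replace real provider SDK calls so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical error a provider client raises when its quota is exhausted."""


@dataclass
class Deployment:
    name: str                   # illustrative label, e.g. one provider or one region
    call: Callable[[str], str]  # sends the prompt, returns the completion text


def complete_with_fallbacks(prompt: str, chain: Sequence[Deployment]) -> tuple[str, str]:
    """Try each deployment in order and move on when one is rate limited."""
    for deployment in chain:
        try:
            return deployment.name, deployment.call(prompt)
        except RateLimitError:
            continue  # this deployment is throttled, try the next one
    raise RuntimeError("all deployments in the fallback chain are rate limited")


# Stand-in clients (assumptions) so the sketch runs without network access.
def throttled(prompt: str) -> str:
    raise RateLimitError("quota exhausted")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"


# The same model hosted by three providers, then a different model as a last resort.
chain = [
    Deployment("same-model-provider-a", throttled),
    Deployment("same-model-provider-b", throttled),
    Deployment("same-model-provider-c", healthy),
    Deployment("different-model-last-resort", healthy),
]

used, answer = complete_with_fallbacks("hello", chain)
print(used, answer)
```

Ordering the chain so that deployments of the same model come first keeps output quality constant for as long as possible; the different-model entry at the end trades quality for availability, which is the trade-off discussed above.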

34:27

So you have now all of a sudden like your whole European rate limit of all the different Azure data centers combined? Yeah, that's really smart. I was only thinking along one kind of axis, which would be switching models. But switching models, I think for content, and this is, I mean conversion optimization always goes to the nitty gritty.

34:48

But for content, if you have a marketing tool that does personalized content based on user demographics, one model might outperform another one, but it should be negligible because we're talking about content here specifically. So switching a model then is not going to be the end of the world. But I like that you have the option to keep the same models consistent and then still manoeuvre around those rate limits that you have. Yeah. And then content, I agree.

35:12

And even there, right, some people like Claude's creative writing, versus some other language differences, right? Because we always think in English. But yeah, if you have French, we have literally, with our contextual router, you can do contextual prompting. You can literally say, hey, the French users we'll send to Mistral, the English we'll send to GPT, and the Dutch we'll send to Anthropic because the Dutch localization is better in that language. And Indian we send to an Indian

35:43

LLM. Well, Indian is not a language, but Indian languages we'll send to this specific LLM for localization. Yeah, right. You all of a sudden need that. Yeah. I like that a lot, to be honest. Like for me, I thought, OK, getting stuff in production, that's one thing. Having kind of the operations mindset with regards to observability, guardrails, seeing where things kind of go out of the boundaries that we think we have is one thing. But I've never thought of this.

36:13

I mean, I have thought of this, but I never thought you could go that deep with regards to using and leveraging different models in the same use case, right? It makes sense to have one use case and then one model; we'll pick the best model. But your use case might vary so much that you can have multi-model solutions based on locale, like you mentioned, or with regards to different writing styles.
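The locale-based routing mentioned here can be sketched as a simple lookup. This is an assumed, minimal illustration rather than the contextual router the guest describes: the model identifiers are placeholder labels, and the `ROUTES` table, `DEFAULT_MODEL`, and `pick_model` are hypothetical names.

```python
# Placeholder model labels (assumptions); real identifiers depend on the provider.
DEFAULT_MODEL = "default-model"

ROUTES = {
    "fr": "mistral-model",    # French users
    "en": "gpt-model",        # English users
    "nl": "anthropic-model",  # Dutch users
    "hi": "indic-model",      # one example of an Indian language
}


def pick_model(locale: str) -> str:
    """Map a locale such as 'fr-FR' or 'nl_BE' to a model label."""
    language = locale.replace("_", "-").split("-")[0].lower()
    return ROUTES.get(language, DEFAULT_MODEL)


print(pick_model("fr-FR"))
print(pick_model("nl_BE"))
print(pick_model("de-DE"))
```

Keeping an explicit default means an unlisted language still gets an answer instead of an error, at the cost of possibly weaker localization.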

36:31

Even data residency, right? So we have clients where, for different business units, the data is not allowed to leave their region. So they will literally route use cases to an LLM deployed in a specific region, right? And they need that orchestration all of a sudden, so it can go really crazy. That's because data residency in AI becomes more and more of a criterion in every decision. It's definitely a hot topic. Right. Yeah, yeah, right. Where does the model run?

37:01

Is literally a thing. And then we now have Australian healthcare companies. The data literally cannot leave the boundaries of Australia, right? And India has the same type of laws and regulations. Well, GDPR, we all know, right? It's all, like, physically not allowed to leave. So how do you guarantee that and make sure that people have that granular control? And of course, if you're just summarizing some SEO marketing text, you don't really care whether you send it to the US or not.
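One way to picture that granular control is a per-use-case allow-list of regions that is applied before any model is called. The sketch below is an assumption-laden illustration, not a compliance mechanism: the region codes, use-case names, and the `RegionalDeployment`, `POLICY`, and `eligible` names are all hypothetical.

```python
from dataclasses import dataclass


class ResidencyError(Exception):
    """Raised when no deployment satisfies the residency constraint."""


@dataclass(frozen=True)
class RegionalDeployment:
    name: str
    region: str  # illustrative region code such as "au", "eu", "us"


# Hypothetical deployments and per-use-case policies.
DEPLOYMENTS = [
    RegionalDeployment("model-us", "us"),
    RegionalDeployment("model-eu", "eu"),
    RegionalDeployment("model-au", "au"),
]

POLICY = {
    "healthcare-summaries-au": {"au"},        # data must stay in Australia
    "seo-marketing-copy": {"au", "eu", "us"}, # no residency constraint
}


def eligible(use_case: str) -> list[RegionalDeployment]:
    """Return only deployments in regions the use case is allowed to use."""
    allowed = POLICY[use_case]
    matches = [d for d in DEPLOYMENTS if d.region in allowed]
    if not matches:
        # Fail closed: never fall back to a region outside the allow-list.
        raise ResidencyError(f"no deployment available for {use_case!r}")
    return matches


print([d.name for d in eligible("healthcare-summaries-au")])
print([d.name for d in eligible("seo-marketing-copy")])
```

The design choice worth noting is failing closed: when no in-region deployment is available, the request errors out rather than silently being routed elsewhere.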

37:27

But if you're, like, summarizing healthcare records, which clients of ours do in healthcare, yeah, the data literally, physically is not allowed to leave the boundaries of the EU, right? Yeah. So you get those types of non-functional requirements. Yeah. As a last thought, I wanted to ask you this question because I see a lot of companies that are starting up and trying to do something with regards to Gen AI, right, trying to find product-market fit, either B2B, B2C, anything they can find.

37:52

What would be your advice with regards to the challenges they're going to face or what to look out for when they're trying to go on that path? Find the right people, and what I mean by that is try to stay away from all these snake oil salespeople, right? Because they will apply to jobs. Especially if you put up an AI job ad now, the top-tier people will not apply. But then you attract all these.

38:19

Well, not everyone applying, but you know, it's a higher risk that you also attract some wannabes. And because companies are so desperate, especially if you have difficulty attracting the right talent and you're not a tech-first, AI-first company, then you're like, OK, I hope this person is the magic pill, right? Then they sell themselves really well. And then you would think, oh, they're going to help us with our AI strategy, and they won't.

38:45

You as a leader also need to be in the trenches and get your hands dirty. So attracting the right people and then just starting, right, just start developing hypotheses and testing, and set yourself up for that continuous testing as well. But that comes maybe in step 2. Step 1 is just get going with the right people. Gotcha. Now how do you do that? Do you find all the people through your network?

39:10

You already mentioned that juniors, in the phase that you're in, are not really what you're looking for. Yeah, we have. I can't remember the last time we put out a job ad. It's always, so our best top-tier performers we have in the company, they just walked into the office like, guys, I want to work here, I don't care what. Yeah, or through their network, they make sure they get an intro.

39:33

So it's always been people applying, and not even like, hey, here's my resume, but more like, let's talk. And then it matches. Also because we're not recruiting tens or hundreds of people, so we only need a couple of really good ones, and we're in a sexy space at the moment. So yeah, people come to us. Yeah. On that front, yeah. Even though you don't have a job position open, would people still come and you would still

40:00

hire those people? For the right talent, we make it work, also because we're always looking for, you know, what we call M profiles, people that have broad interests and capabilities and actually can go deep on a couple of fronts. Gotcha. Right. On AI or on engineering, but they're also just mini CEOs on their own. Yeah, right. Most of our people from day one have been saying, one day I'm going to have my own company, I'm here to learn. And I'm

40:26

totally supporting that. I would love to, you know, be their angel investor one day. So yeah, that's the type we're attracting and working with. It's all these mini future founders themselves. That's awesome. Thank you so much for coming on and sharing. So this was amazing. I love that. It was a lot of fun, I must say. It was good. All right then, we're going to round it off here. Thank you so much. The best way to support the show

40:51

is to leave a like. Whether you're on Spotify, Apple Podcasts or on YouTube doesn't really matter. Leave a like anywhere and we'll see you on the next one.

Transcript source: Provided by creator in RSS feed
