The future of GitOps and platform engineering | Argo Unpacked Ep. #17 | Argo Unpacked podcast

⁠¶ Episode Introduction and Project News

00:00

What is the future of GitOps and platform engineering and where does Argo fit in? Plus, why you shouldn't be using parameter overrides in Argo CD applications. All today on Argo Unpacked. Welcome to another wonderful episode, episode 17. Today we're going to be joined by Michael Crenshaw, who's the Argo CD project lead. Very excited to bring him in; we're going to do that in a minute. And of course my co-host Revital will be joining us. But before I bring them in,

00:29

we've got a couple of things going on in the Argo project that we want to cover and talk about. And we're actually going to talk about more of that news a little bit later, but obviously the Argo CD 3.2 RC is coming out, and you're going to want to try that. There's a couple of new features coming, plus a nasty bug that we were able to fix before it became an issue for users. A lot of people don't realize how many things we get fixed

00:54

before a new release of the project comes out. Obviously a lot going on from the security perspective. And of course, next week is ArgoCon and KubeCon. It's going to be happening in Atlanta, and I'm looking forward to meeting a lot of you there. So hopefully you'll tune in and talk to us. Now, before I bring in Michael to talk to us.

01:14

And we're going to talk about the future of GitOps and platform engineering. And there is a lot going on in that space, a lot happening with environment promotion, application promotion. There's a lot that people are asking from the community. So we're going to get into it. But before we get into that, I want to share an Argo tip this week.

⁠¶ Argo Tip: Avoiding Parameter Overrides

01:29

about using parameter overrides in Argo CD. A lot of people are doing it wrong today, and they may not necessarily realize it. So what are parameter overrides? In Argo CD, you define a source of truth, usually in Git, and you define a destination, usually a Kubernetes cluster and a namespace. In fact, always a Kubernetes cluster and a namespace. And then you can set different parameters for your application. Sync parameters, or rather sync policy. Sorry. You set sync policy

02:01

as well. So auto sync or manual sync, how you want it to be diffed. There's all these kinds of little parameters. And then there are parameter overrides. So if you're using something like a Helm application, you can specify values in the application manifest that you would like to go into the manifest generation of that Helm chart. So what does that mean? Well, it means that when you create your Helm chart, you always have a values file. So the chart is a template,

02:29

your values file defines what values you want to use in that template, and the way that most people do it is they have their Helm chart either in a separate Helm repository or sitting inside of a Git folder, and then they have a values file sitting in their Git repo that specifies all the values that should go into that templating experience. And when you run

02:53

an Argo CD application, it's going to run the Helm template command to generate raw manifests that it can use to apply to Kubernetes and then do diffing and handle all of that diffing work. Now we actually recommend that you don't use parameter overrides on the application because it creates a split source of truth, a split confusing source of truth that you have.

03:21

If you have the values set in Git, then you can do a diff on those values. You can set branch controls. You can set all kinds of parameters for how those values are going to be handled. But when you set those values as part of the Argo application, while it's very convenient to do that when you create the application, it also means that when you go to render your Helm chart locally,

03:47

you now no longer have a way to do that. So let me say that again. If you go to render that Helm chart locally and run the helm template command, those values that are part of the override of the application are no longer available to you. You have to go and manually pull them out and recreate those values, and then you can regenerate exactly what Argo CD is doing. So it makes...
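To make the manual step concrete, here is a minimal Python sketch of pulling overrides out of an Application and turning them into `helm template` flags. The `app` dictionary, release name, and chart path are hypothetical, and the sketch only reads the `valueFiles` and `parameters` fields of the Helm source. Argo CD supplies more context when it renders, so treat the result as an approximation, not the exact command Argo CD runs.

```python
import shlex


def helm_template_command(app: dict, release: str, chart_path: str) -> list:
    """Build a `helm template` argv that approximates an Application's Helm source.

    Assumption: only `valueFiles` and `parameters` are handled here;
    any other Helm source settings are ignored by this sketch.
    """
    helm = app.get("spec", {}).get("source", {}).get("helm", {})
    argv = ["helm", "template", release, chart_path]
    for values_file in helm.get("valueFiles", []):
        argv += ["--values", values_file]
    for param in helm.get("parameters", []):
        # forceString maps to --set-string so Helm keeps the value as a string.
        flag = "--set-string" if param.get("forceString") else "--set"
        argv += [flag, f"{param['name']}={param['value']}"]
    return argv


# Hypothetical Application, trimmed to the fields this sketch reads.
app = {
    "spec": {
        "source": {
            "path": "charts/guestbook",
            "helm": {
                "valueFiles": ["values-prod.yaml"],
                "parameters": [
                    {"name": "replicaCount", "value": "3"},
                    {"name": "image.tag", "value": "1.4.2", "forceString": True},
                ],
            },
        }
    }
}

print(shlex.join(helm_template_command(app, "guestbook", "charts/guestbook")))
```

Every `--set` flag in that output is a value that lives only on the Application and not next to the chart in Git, which is the split source of truth the tip warns about.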

04:10

It makes it difficult to actually replicate what's going on on that Argo CD application in the configuration that you're using. It also makes it difficult to handle promotions. Inside of GitOps Promoter, or I should say inside of GitOps Cloud, the GitOps Promotion Engine, it works off of the source manifest of a repository. And inside of...

04:37

GitOps Promoter, the project from Intuit that Michael's been working on, it also works on the source of the Git repository. And so if you need to modify the application, you would need to do some extra work and parameterization at some other level of abstraction. Maybe Mike will correct me when he comes on, but you'll have to do that extra work to make that work with parameter overrides. So it makes it difficult to replicate exactly what's going on in your application.

05:06

It makes it difficult to do promotions. And it causes all kinds of issues when you go to actually use Argo CD at a higher scale. It means that when you look at Git, you can't just re-bootstrap your application from what's available in the manifests, and all of the Git controls and parameters are no longer available either. So it's a really good idea to not use those parameter overrides and instead stick those things in a values file

05:32

and throw them in your Git repo so you don't have to worry about it. So that's advice number one. Now there is a little bit of a tweak that I'll add. If you're using application sets, that is a way of programmatically generating applications. When you generate those applications, it's very easy from the generators that you're using. Let's say you're using a cluster generator

05:57

or you're using a Git generator: you can pass parameters from that generator in as overrides in your template. And that's an area where it makes more sense, because you're saving so much time by using an application set and generating these applications dynamically that you would actually take advantage of the parameter overrides there. However, I would challenge you

06:24

to go and look at your templating, look at each of the parameter overrides that you're doing in an application set, and ask if it wouldn't be better if they were sitting in the Git repo alongside the manifests that you have. Would it not be better to have a values file there? Because often it is. And so we do challenge people to see if they can move more of those things into their manifest Git repository, because they'll have a better experience with

06:50

generally everything GitOps, but there is definitely a use case for why you would want to use parameter overrides as part of an application set. So your Argo tip of the week is... You're probably abusing parameter overrides in Argo CD applications. You should probably stop. We'll throw a blog post into the links.
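One way to take up that challenge is to sort an ApplicationSet's Helm parameters into two piles: those fed by generator placeholders, and those with fixed values that could probably live in a values file in Git. The Python sketch below is only an illustration; the `appset` dictionary is hypothetical, and the check for `{{` is a rough heuristic for "comes from a generator", not a real template parser.

```python
def split_overrides(appset: dict) -> tuple:
    """Return (templated, static) Helm parameter names from an ApplicationSet template.

    Assumption: a value containing "{{" is fed by a generator placeholder;
    anything else is a fixed value that may belong in a values file in Git.
    """
    helm = (
        appset.get("spec", {})
        .get("template", {})
        .get("spec", {})
        .get("source", {})
        .get("helm", {})
    )
    templated, static = [], []
    for param in helm.get("parameters", []):
        bucket = templated if "{{" in str(param.get("value", "")) else static
        bucket.append(param["name"])
    return templated, static


# Hypothetical ApplicationSet using a cluster generator, trimmed for the sketch.
appset = {
    "spec": {
        "generators": [{"clusters": {}}],
        "template": {
            "spec": {
                "source": {
                    "helm": {
                        "parameters": [
                            {"name": "clusterName", "value": "{{name}}"},
                            {"name": "ingress.enabled", "value": "true"},
                            {"name": "replicaCount", "value": "2"},
                        ]
                    }
                }
            }
        },
    }
}

templated, static = split_overrides(appset)
print("fed by the generator:", templated)
print("candidates for a values file:", static)
```

Only the first pile really needs to be an override; the second pile is where moving things into the manifest repository tends to pay off.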

07:10

We have one up at codefresh.io/blog that Kostis has put together on 30 different anti-patterns that we see. And one of the patterns that we see is the abuse of parameter overrides. And it does cause issues, especially when you go back to re-bootstrap your applications from the manifests that you have in a Git repo. That source of truth is now split between different things, so it becomes a problem. So that's your tip of the week.
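For anyone acting on the tip, here is a rough Python sketch of moving existing overrides into a values file: it folds dotted parameter names into a nested structure and prints it as JSON, which is also valid YAML. The `overrides` list is hypothetical, and the sketch makes simplifying assumptions: names are plain dotted paths (no list indexes or escaped dots), no name is both a leaf and a parent, and values stay as the strings stored on the Application, so types such as integers and booleans would need restoring by hand.

```python
import json


def parameters_to_values(parameters: list) -> dict:
    """Fold Helm parameter overrides into a nested values structure.

    Assumptions: names are plain dotted paths, no name is both a leaf and
    a parent, and values are kept as the strings they were stored as.
    """
    values = {}
    for param in parameters:
        keys = param["name"].split(".")
        node = values
        for key in keys[:-1]:
            # Walk down, creating intermediate mappings as needed.
            node = node.setdefault(key, {})
        node[keys[-1]] = param["value"]
    return values


# Hypothetical overrides copied off an Application.
overrides = [
    {"name": "replicaCount", "value": "3"},
    {"name": "image.tag", "value": "1.4.2"},
    {"name": "image.pullPolicy", "value": "IfNotPresent"},
]

print(json.dumps(parameters_to_values(overrides), indent=2))
```

Once the result is committed next to the chart and referenced as a values file, the overrides can be dropped from the Application and the values become diffable and reviewable in Git.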

⁠¶ Argo Unpacked Milestones and Demo Attempt

07:36

Stop abusing parameter overrides. You'll be happier for it. Of course, we've got a lot of other great news to celebrate right now, this week. I don't know if you realize, but Argo Unpacked just passed a thousand subscribers on YouTube. So we are getting popular. We appreciate it. If you like the show, if you like the podcast, like and subscribe, share it with your friends, post your favorite clips, and even just make a comment. Let us know what you liked about it.

08:03

Let us know if there's something you want us to cover in future episodes. We'll be watching the chat as we go through today. I see there's already some conversation going on in there. Laurent is saying maybe you should show an example of what a parameter override looks like, because people might not realize what you're talking about. This is mostly an audio show, so we don't do too much showing, but

08:26

Maybe I can pull up an Argo instance here really quick and show you what I'm talking about if I can get logged in. I don't know. I didn't set up to do it, so I don't want to spend too much time on it. But if I go to create here, let me, we'll do it. We'll do it right now because Laurent said he wants us to. We're going to do it very briefly. And even if you're listening on the audio, it shouldn't be too.

08:56

too funky. So I'm going to grab my example application repository. I've got my Argo instance up. I'm just going to use the default project. I'm going to leave it on manual sync because I'm not even going to sync it in the end. And I'm going to set my path here to... I should have one sitting. Actually, probably want this on. Get my path link up here. So I'm going to grab a Helm chart. Copy my path.

09:35

Boom. Okay, so I've got my Helm chart here, and it should pop up with my... it looks like it's not refreshing here, so it's not showing the parameters that are part of this Helm chart yet. I'd have to fiddle with it. So yeah, maybe I shouldn't have tried to YOLO it. Gosh darn it. You tricked me,

10:01

Laurent, into trying to do it without having looked at it for a second beforehand, and I don't want to spend the time on it. But basically, there'll be an option here that shows up to do parameter overrides. And if I go to...

10:15

Parameter overrides. If I go to the documentation, we can just look at it too. It'd be an easier way to do it. So yeah, you can also set them as part of the CLI when you're setting values, and you can also set them for Kustomize, right? You can override specific values that you want to set. But basically all of the values that are available in a Helm chart values file will show up as available parameter overrides.

10:47

So once those are there, people will start filling them in and putting stuff in, and of course if you do that, you will be in all of the issues that we talked about earlier. So anyway, there you go, that's a basic outline of parameter overrides. Thanks, Laurent, for convincing me to try to go and look at it live. So

⁠¶ Welcoming Guests and ArgoCon Preview

11:15

Before we get into it, we're going to bring on Michael and Revital, but first we're celebrating the fact that we've got a thousand subscribers. Again, go share the show. We appreciate it. And let us know what you'd like to hear about in the future. So let me bring in Revital and Michael. So Revital is the co-host of Argo Unpacked. Welcome. Hey.

11:40

And Michael Crenshaw is the Argo CD project lead and has been for, I want to say, almost two years now, Michael? Somewhere around there. Yeah. Michael got the job because he showed up and was the most prolific maintainer anybody had ever seen. He has an uncanny ability to review 50 or 60 pull requests in a week, and

12:05

I don't know how he does it because it's not like if I tried to review 50 or 60 pull requests, I'd be like, looks good to me. Looks good to me. Looks good to me. But you go through them line by line and you give really good feedback for people and you challenge them.

12:18

when they've maybe built something in a funky way, and you've been a really good guardian and steward of the project. We really appreciate you taking the lead role about two years ago, and you're still going strong, so thank you, Michael. Yeah. Thank you for saying that. Shout out to Intuit for paying my bills so that I can actually do all that work. And as a side note, I have two PRs. So if you can, after the show, take a look, that would be fantastic.

12:45

Yeah. Speaking of, get to work over here. So first of all, Revital, we've got KubeCon coming up next week. ArgoCon in Atlanta. I've never been to Atlanta. I don't think you have either. I haven't. Michael, are you familiar with Atlanta? Do you have the inside scoop there? Well, I've been there a couple times, but not exactly the inside scoop. I know they like peaches. Okay. Peaches. All right. We'll look for peaches.

⁠¶ ArgoCon Sessions and Hallway Track

13:13

So for me, it's the second time I'm going to attend ArgoCon. And I was actually looking at our previous episodes, episode number three. We did a whole episode about ArgoCon and KubeCon. And I think it was a perfect preparation for me. I was attending the first time. I was overwhelmed. I felt a lot of FOMO all the time because there were many sessions. Each session is 10 minutes long. And the tips that I got from the, I'll share, I'll throw the link. There we go.

13:56

I think that the tips I got were amazing, because the FOMO feeling was less stressful, and I knew that everything was recorded, so I could go ahead and check out whatever I'd missed. But one thing that I remember is when I saw the agenda and I saw that every session is 10 minutes only, I looked at the first ArgoCon. I think it was 2021, unless I'm wrong. The sessions back then were at least 40 minutes each.

14:31

What happened? How come we changed them? And I know what happened, right? But it's really interesting to see that we went from one hour, one hour and a half, with 40 minutes as the minimum session, to 10-minute sessions. It's like, you know how we scroll down... No, we still have longer sessions, but the normal sessions are 30 minutes. 25 minutes, I think. Yeah.

14:59

Yeah, 25. So they're not 10. Those are lightning talks. Yeah, but it's so, so different, right? What does it mean? What does it tell about us as human beings? You're worried that we've TikTokified ArgoCon? I'm not worried. We have, right? No, I think it was really... We did used to have some 35 or 40 minute sessions, and now 20 to 25 minutes is the standard session length at ArgoCon. And we just found that

15:40

With a lot of those, there was more Q&A at the end. And sometimes there wasn't that much Q&A. Sometimes it was like, oh, that was great. Thank you. And then we'd kind of have a little bit of dead space. And at the same time, a lot of the people that want to do Q&A want to go and talk to the speaker one-on-one and explain their scenario. And so we found that 25 minutes worked because

16:02

Then when they went to the Q&A, the speaker could just go off the stage, step out. And if people needed to talk to them, they can go and talk to them. And then we could cover more topics, right? We could fit an extra. So like teasers, small teasers. And if you want to go deep dive, you go ahead and chat with the person who gave the talk. And that was actually one of your tips. It's like, yeah, we have a lot of sessions, which is great.

16:27

The in-between sessions, the conversations in between sessions, these are the most interesting discussions. Yeah, we call it the hallway track. Yes. Yeah. The hallway track is always the best track because that's where you talk to, like, there'll be people that are going to be presenting at ArgoCon. I actually should have, we should have pulled up the list of presenters and the companies and stuff, because there's a lot of great companies and end users and stuff that are going to be presenting.

16:50

But those are all people that have approval to talk about the stuff they're talking about, right? But in the hallway track, you get conversations from people who don't have approval to talk about what they're talking about.

17:03

Some of those are really fun. And I remember one at the very first DockerCon, and this is a story I always tell, Michael, I'm sure you've heard it a dozen times by now. We were sitting there talking. DockerCon was held the night before the election in 2016. Yeah, 2016. So a real big deal election, to set the stage.

17:28

And we're talking to some folks from USA Today. I don't think they would mind if I told this story now. And we were like, oh, how are you going? Are you adopting Kubernetes? And they were like, we pushed everything to production yesterday for Kubernetes.

17:42

We switched everything to Kubernetes. And we're like, you did it two days before your biggest news day of the four years? And they were like, yeah, YOLO, we were going to make it happen. So that was really cool, and we got to talk to them about their implementation and what they were thinking of. And that story right there,

18:00

in the moment of 2016, told us more about the pull of Kubernetes than all the talks we'd seen all day, right? So the hallway track ends up being awesome. I don't know, Michael, do you have any awesome hallway experiences? that you're not under NDA to share? Oh, good question. I think a lot of the really good conversations have come from just bumping into Kubernetes maintainers or just open source maintainers in general who I...

18:28

I generally wouldn't have any reason to chat with. And you just get into topics that, you know, since no one's sitting there telling you, oh, you're going to talk about this now, or this is the track, you know, they can be very off the wall. I've had really great conversations with a fellow who created YAML and I sat with several of my teammates and we just like went back and forth on different design components of YAML and what we liked and didn't like.

18:55

And it was a lot of fun. So yeah, hit the hallway track for sure. Yeah. Yeah, definitely. Revital, any other tips that you thought were helpful for you when you were a first-timer?

⁠¶ Conference Booths and Networking Tips

19:06

I think that one of the tips that I got was I went while at KubeCon, I went to the Argo booth.

19:16

and I didn't have specific questions, but there was a huge line all the time, with people coming and asking questions. So I was just standing there listening to the questions and the discussions, and I gained a lot. It was super, super interesting. Yeah, and there will be an Argo project booth in Atlanta, so come hang out with us. The maintainers are going to be there, and we'll be rotating, so if you come by and you

19:43

don't see the person you want to talk to, you can probably come by an hour later, or just ask and you'll find out. There's a schedule the maintainers share to figure out when we're going to be there. But we've got that. I think Octopus Deploy obviously will have a booth, and Revital and I will be hanging out there. And I think, Michael, Intuit has a booth at both ArgoCon and KubeCon again this year, right? Yep. Lots of booth duty.

20:04

So maybe part of the reason we like the hallway track the best is because it's the only track that we get to go to for 80% of the conference. Correct. And I have one thing. I'm planning to basically chit chat with people about Argo. So if you have any tip you want to share. Look for me. I want to make a lot of shorts and get basically a lot of tips out of this KubeCon conference.

20:38

and publish them. So if you have anything useful, anything you want to share, any tip you think is one of the best tips you can share with others, come and find me. Yeah. And actually, Revital and I are going to be doing a live mini episode of Argo Unpacked at the Octopus booth. There's a demo theater and we're going to be doing that, I think, on day one or day two.

21:04

Don't miss that. There'll be lots going on. Come check it out. Great chance to meet people. Great chance to learn a lot of stuff. And of course, we'd love to have you present as well. The ArgoCon EU CFP is now closed, so you're too late if you didn't do it. It was due yesterday, so you just missed it if you were planning to do that. But if you want to present for KubeCon North America next year, the opportunity is open.

21:29

And of course, there are lots of other things. So if you're interested in speaking, feel free to reach out. If you just want to shoot the breeze, you can reach out through the Argo Unpacked social media channels, or you can get me on X @todaywasawesome.

⁠¶ Argo CD 3.2 Release Details

21:41

And then we mentioned Argo CD 3.2 coming out shortly. I think the release blog for it came out a little while ago. Do we have the release candidate already for this? We've got release candidates, but not the GA release. I need to cut that release, actually, after this. Oh, excellent. Okay. Are you planning to do it today?

22:06

Yep, planning to do it today. I'm going to do a recording of the process, and that'll be something that other maintainers will get to do in the future. Did we get the Git fix in?

22:21

that Evgeny was working on? Good question. There were a few last-minute suggestions for that change, and I don't know whether those have been buttoned up. I think there was basically consensus that we could get away with the GA release without that particular bug fix, and then we would just quickly do 3.2.1. But if we can get it into the GA, even better.

22:46

It looks like, oh yeah, I can see there's a lot of activity on it. It looks like Leo just gave it a thumbs up and an approval. So it's probably ready to merge. I actually was thinking that we should actually fix this. So for people that don't have context.

23:00

This is a bug that is kind of elusive, but basically Git, the actual binary that we ship in Argo CD, was upgraded to a new version, and that version included a change where some of the cleanup processes that Git runs were moved into the background. That meant that when one failed or got stuck, it was not detected by Argo CD.

23:27

So the bug is that every once in a while, the repo server would lock up, and basically you wouldn't be able to render applications because there was a Git lock sitting there. And when that happened, the only way to fix it was to restart the repo server. And it was incredibly hard to reproduce because you're basically looking for random failures that could happen to a background process of Git.
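As a rough illustration of what "a Git lock sitting there" can look like on disk, the Python sketch below scans a Git directory for `*.lock` files that have not been touched for a while. This is not how Argo CD detects or fixes the problem, and the episode does not say which lock file was involved; the age threshold, paths, and the demo repository are all assumptions made for the example.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def find_stale_locks(git_dir, max_age_seconds: float, now: Optional[float] = None) -> list:
    """Return *.lock files under git_dir whose mtime is older than max_age_seconds.

    Assumption: an old lock file hints at a Git process that died or hung
    without cleaning up. A long-running but healthy process could also match.
    """
    now = time.time() if now is None else now
    stale = []
    for lock in Path(git_dir).rglob("*.lock"):
        if lock.is_file() and now - lock.stat().st_mtime > max_age_seconds:
            stale.append(lock)
    return sorted(stale)


# Demo on a throwaway directory standing in for a repository's .git folder.
with tempfile.TemporaryDirectory() as repo:
    git_dir = Path(repo, ".git")
    git_dir.mkdir()
    lock = git_dir / "index.lock"
    lock.touch()
    an_hour_ago = time.time() - 3600
    os.utime(lock, (an_hour_ago, an_hour_ago))  # backdate the lock file
    for stale_lock in find_stale_locks(git_dir, max_age_seconds=600):
        print("stale lock:", stale_lock.name)
```

A check like this only reports candidates; deciding whether a lock is safe to remove still requires knowing that no Git process is using it.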

23:51

And Yevgeny, who's on the OSS team at Octopus Deploy, actually... the one and only pbpf to this thing, and he went deep on it. By the end of a couple of weeks we had five or six people looking at this bug, and Yevgeny finally figured out that it was because of a backgrounded Git process that was failing and leaving a lock in place. So anyway, it looks like it's ready to go now. I see

24:20

I see Leo is giving the thumbs up. So I think it's just waiting on you, Michael, if you decide to merge it in. But if it still needs work, let us know and we'll get it fixed. Cool. Yeah. A huge shout out for getting that done. So much of open source is one person just deciding to take a deep dive. And that was the case here. So very much appreciated.

⁠¶ Key Features in Argo CD 3.2

24:42

Yeah, and 3.2, I think we did. Did we already do an episode about what was big in 3.2? Did we already talk about it on Argo Unpacked? We didn't have a full episode about it, but I think we had a bunch of mentions from you. This is one that I think Nitish led, right, Michael? And it was his first time being a release manager, so I love to see people coming up on the project and doing a lot more.

25:14

I know there were some cool features added to support GitOps Promoter from Intuit, the project that you've been working on, Michael, so that's valuable. I know there's a new deletion strategy for Progressive Sync that solves... there was some funkiness, depending on how people were using progressive sync, where when it went to delete components, it would do them in the wrong order and things would get stuck.

25:39

And so that's a great improvement. I know there's a lot of work that was done on Hydrator, which is another project that you've really been leading, Michael. Is Hydrator, do you think, ready for people to start treating as more GA? I think it's not. I think it's still a beta API, right? Or is it still alpha?

25:56

Still alpha. I think it's definitely worthy of folks trying on maybe some lower risk applications. I mean, at Intuit, we're using it currently to deploy Argo CD itself, and it's been behaving well.

26:11

There are potentially some significant features we're going to add in 3.3, and we just want to make sure that things are really, really stable before moving it up to beta. Red Hat has indicated interest in moving it along to beta, so hopefully we'll get a little bit of extra help. I think there may be one or two dangling bugs to knock out there, but it's in really good shape. I really recommend giving it a try.

26:34

We were also talking about moving progressive sync up to beta. Is that going to be done in 3.2 or is that coming after 3.2? I think that'll happen in 3.3. The drive for that came after the RCs had been released. We try not to move much stuff after the first RC. I think the feature that I'm probably most excited for in 3.2 is server-side diff support from the CLI.

26:59

That's a really cool feature to be able to do server-side diffing. We've talked about this before, and this might go into our platform engineering discussion, but within Argo CD or within Kubernetes, there's really three levels of diff. There's the dry manifest before they're rendered. There's the rendered manifest or the wet manifest, which hydrator is doing. And then there's this server side manifest, which may be actually different than what you've rendered.

27:27

And so being able to do a server-side diff is really valuable and allows you to cover that last 5%, the last mile of diffing. Yeah, and my teammates, Leo and Peter, have put in a ton of work solidifying server-side diff over the last couple of months, and they will have a talk about all the nitty-gritty details at ArgoCon.

27:50

Be sure to catch that. Yeah, that's going to be awesome. It's also the first release where we've had GitOps Engine merged into Argo CD. It used to be a separate project in a separate repository. Leo from Intuit and Pat from Octopus Deploy put a lot of work into that. That's going to make everybody's lives easier when they go to continue contributing. Awesome.

28:19

Great news on Argo CD 3.2 coming out. Sounds like either today or tomorrow. So get your hands on that folks and start trying it out. Check out some of those new features. And again, we'll post the link to the blog post in the chat.

⁠¶ Future KubeCons and Platform Engineering Overview

28:34

so that people can read up on that 3.2 release. And then of course, we'll start working on 3.3, and we've got KubeCon Amsterdam next March, so that's when we come into EU. And then North America next year just got moved to Salt Lake. Is that right? Back to your neck of the woods.

29:00

I can't believe that. I've been, we've been going, I've been going to KubeCon for 10 years now and they went to Salt Lake once and I thought it was like the craziest thing that could have happened. Cause like, what are the odds they do it in Salt Lake?

29:14

And then I don't know what was going on, because I think they were supposed to do it in LA or San Diego or something, and then something happened where it wasn't working, and they're like, actually, we're going to go back to Salt Lake. We found it a great place to be. They loved the mountains and they loved seeing the Argomobile, I think, outside the entrance, and so that must have been the thing that tipped them to come back. I think probably, yeah. The Argomobile, I hope it makes a comeback.

29:38

Oh, buddy. Oh, buddy. Next year, it's going to be a year. All right. So let's get into our conversation. We're going to talk about the future of platform engineering.

⁠¶ Internal Developer Platforms and GitOps Promotion

29:52

You know, how Argo CD is part of this. I think the natural place that I would start is in promotions, but maybe to go even higher level than that, what I've seen as a pattern, and tell me if this resonates with you, is that every company basically wants to build some kind of internal developer platform. And they're trying to solve the problem of how do we make it easier for people to make changes, get it into production. And a lot of people grab Argo CD

30:28

off the shelf and they say, hey, this is a place where people can log in. They can see all their applications. They can go and see the live status of objects. They can look at their logs. They can do debugging. They can do sync operations. They can do deployments. hey, this is basically my platform engineering strategy here in a box. And that's why Argo CD already has 50 to 60% of the Kubernetes market is using Argo CD today. It blows my mind. That's crazy. That's huge.

30:55

That's very dominant. Could you imagine if 60% of the cars on the road were one brand? I mean, that's what we're talking about here. So this is pretty big. But that's probably the natural place that I would start: everybody wants to build kind of an IDP, and Argo CD is a natural part of that. And then of course, the problem that we've all been chasing this last 18 months or so is really around GitOps promotion. Is that what you're seeing,

⁠¶ GitOps Abstraction Challenges for Large Orgs

31:19

Michael, or is there a different direction for platform management that you're thinking of? Well, I think environment promotion is a huge part of it, but not the only part. I think that... GitOps, since it's such a simple idea and it plugs into such a well-established ecosystem, it just works really well for so many things that we need to do. But I think that there are a few places where...

31:48

it's either coming up short or it just doesn't happen to be the sole solution to the problem. Environment promotions is a big one. But I think that abstraction might be the other one.

32:01

And abstraction is weird, because how much abstraction you need and how you do it depends a lot on how big your organization is. I kind of split it up into three layers. You've got early-stage IDPs, where you just want to enable your developers, so you give them a Git repository, you let them put Kubernetes manifests in there, maybe you let them use Kustomize,

32:27

and they kind of do what they want, and that does enable them. That works for pretty small orgs. Then you realize, well, I need a little bit of control from the platform perspective, so you do what you mentioned in your Argo tip, which is not ideal: you add things to the Argo Application CR that are sort of platform controlled, and you override what the user would do. So you make sure they don't do just whatever they want; they do whatever they want

32:56

with a few caveats. And then finally you get into later stages where you really need strong separation of concerns, and that's the scale that Intuit's at. By the time you have 10,000 repositories with manifests in them, you can't let people deploy whatever they want. You have to have certain safeguards, certain compliance requirements come into play, and you've got five different teams vying to maintain the information that's in these Git repositories: networking, observability, anything you can name.

33:31

they all need to have their finger in that pie. And it becomes really difficult to do both platform engineering, abstraction, and GitOps at the same time. And I think that the GitOps ecosystem...

⁠¶ Scaling GitOps with Abstraction Tools

33:45

needs to catch up to helping solve those problems. Yeah, that's where that abstraction becomes so difficult, because the more you abstract, the less granular control people typically have, right?

33:59

And that's where larger organizations can struggle, because each team has some special needs, right? They need to do things some specific way or whatever. And so oftentimes, from a technology standpoint, I try to solve this with judicious use of ApplicationSets, where I can set up rules that are like, hey, if your repo looks like this,

34:25

here's the outcome of what's going to happen at the end of it. And so if you can just kind of make your repo look like this, you'll be in good shape and you can go and use this thing. But that's only part of the story because... Like you said, if you've got 10,000 apps or let's say you've got 10,000 developers, they all need training. And we talked about this a little bit last week. We were talking about the culture of GitOps.

34:52

In many ways it is similar to the nature of DevOps, right? DevOps is all about, oh, how do I enable self-service and let people run on their own and be independent? And GitOps is like, yes, do all those things, but build it around the structure of having real-time knowledge about what's deployed, and build it around the structure of controlling things from Git and using declarative configuration. And so it ends up being

35:19

closer to a roadmap about how to actually implement something that allows you to do that. But like you said, it starts to become very difficult to manage in a central fashion as you get larger.

⁠¶ Argo CD's Role in Platform Engineering Ecosystem

35:34

So obviously you've been working really heavily on the hydrator in Argo CD, which is definitely a help. You've also got GitOps Promoter that you've been working on. Do you think we should make the mission of Argo CD larger? Or is it more about bringing in additional platform tools that work really well with Argo?

36:02

I think probably more the latter. But I think that the Argo ecosystem is in a really good position to help shape those tools. I think that Argo has really taken off with app.

36:17

developers as opposed to platform engineers. I think it works very well for platform engineers deploying infrastructure with Argo CD, fantastic. But app developers love it because they get a beautiful user interface that explains to them the state of their application and in many cases kind of teaches them Kubernetes. A lot of these app developers, you're writing a Java application, you're not a YAML engineer yet.

36:41

You want something to explain it, and Argo does that really well. So I think that the tooling around how we build abstractions probably needs to improve.

⁠¶ Exploring Abstraction Tools: Helm, Controllers

36:53

I think right now we basically have three options for how we abstract. Either we don't, that is, Kubernetes manifests or Kustomize. Kustomize isn't really an abstraction tool. It's more of a last-mile

37:08

tweaking tool, I guess. Helm, I think, is basically the de facto deploy-time abstraction tool. By deploy time I mean you push to Git, and then Argo CD is doing the deployment: it runs helm template, and it takes minimal inputs and blows them up into outputs. That's your abstraction layer. I think that you can layer Kustomize on top of that. I really like the Helm chart inflator feature of Kustomize,

37:37

so that you can basically tell users, here's your Helm inputs, that's your safe abstraction, plus you get Kustomize, and that's where you're sort of on your own and, you know, use it at your own risk. And then kind of the final abstraction tool is controllers. That's a really high barrier. And I think that's why we don't see a lot of IDPs built on something like the Intuit app CRD.

38:02

Because maintaining a Kubebuilder project, even with tooling like Crossplane and KRO coming along, it's still a lot to ask to build a controller and maintain it, with versioning and all that. Yeah, I actually had an experience with a company who had taken everything in their entire company. This is a large company. I mean, we're talking over 3,000 engineers.

38:27

And they had made everything into its own CRD with its own operator. So if you wanted to provision a new web front end, there was like, you just did the CRD and it had an operator that ran it. And this is pre-crossplane, right? And so they abandoned that within about 12 months. I think that was so crazy.

38:51

to imagine doing it. And Crossplane gets you closer, right? Because you can make a composite resource and then you just chuck it into Argo CD and you're like, it works pretty well. I can build my composite resource, I can leverage things that are outside of Kubernetes if I need to.

39:06

And that gets you a nice chunk of the way there. But I haven't seen the same level of pickup and adoption of Crossplane that I had expected initially. And I think it's probably driven mostly by people being really bought into Terraform and not seeing the value of a constantly running operator for some of those non-Kubernetes resources.

39:31

But Revital, actually, on that topic, you and I went over a list of questions that you had been thinking of. And this actually introduces that first question that you had written down.

⁠¶ Git Declared, Runtime Computed, and AI

39:46

My first question, my draft has changed so much. You asked, how do you define the line between declared in Git versus computed at runtime? That one. Do you want to add anything more to that question? Let me open that, because I had many questions about AI that I want to talk about as well. Well, so maybe we should answer the line between declared in Git versus computed at runtime while you're looking at those AI ones. So Michael, you want to take first swing at it? Sure.

40:29

I think that probably moving everything to runtime is not the best idea. I think the things you want calculated at runtime are things that are built on really well-established technology. And that probably means it's not built internally. You're using the native Kubernetes deployment controller or Argo Rollouts. You're using really solid

40:56

resources or CRDs. I think that it makes sense to move most of your internal stuff into being not necessarily declared in Git in the sense that the user is writing it all out. Because sometimes you have things that you don't want the user to bother with. Like you've got some type of security settings that you automatically apply to deployment on the user's behalf.

41:24

If you want it to be transparent to the user, I think deploy time is a great time to do that. You have a Helm chart that automatically adds this platform concern to the manifests. And I think that Source Hydrator is

41:38

an excellent thing to couple with that, because it exposes that platform functionality to the user. I think sometimes we think of abstraction as being basically equivalent to obfuscation, like we don't want the users to see what's going on behind the scenes. But I think that abstraction should be an invitation to collaborate. If you're transparent with your users,

42:02

they understand what's going on. If something's not working well, they can suggest to you how to fix it. So, kind of a long way to answer: I think that declaring in Git, or using a hydration tool to push to Git where things are very visible, is really probably the best move for most internal uses. Yeah, there's the server-side stuff. I tend to think of computed at runtime as things like what ApplicationSets do. So if I've got a pull request generator,

42:38

it's going to be looking and figuring out what pull requests are open and it will generate the application off of those. And so making a Git commit for every pull request that's generated, that's in addition to the pull request just existing at all. to be like, okay, go create this application feels not useful. I've already got it in Git. It's already sitting in a pull request. Just go generate the application and leave me alone. And as soon as I deal with it in the pull request.

43:05

I don't introduce some extra interface, I just deal with the pull request. If I shut off the pull request, the application gets deleted, and we move on with our lives. So stuff like that works really, really well. Cluster generators tend to work really well, though most of those things end up being computed or set inside of Git, right? And then you've got things that are like,

43:27

where people often want to use it are things that they would use Helm lookups for. And I think Helm lookups have been really abused. And of course, we don't even support them inside of Argo CD directly. You can use a CMP, a config management plugin, to build support for Helm value lookups, which I did, and it's terrible, so I don't recommend it. But it was an experiment to see, hey, this is what it would look like, and here's how it introduces security problems or whatever.

43:54

But Helm value lookups, a lot of people will say, okay, I'm going to create these values on my cluster and then I can inject these values into my manifest. But Kubernetes already has ConfigMaps and Secret mounts, and so most of those things should actually just be in Secrets. And if you're using something like a secrets operator, External Secrets Operator, when you update the secret, it can go and actually restart and update

44:16

the pods so they get the new value. Whereas if you're doing a value lookup and you change the value, you now have to go and recompute the manifests and regenerate them and reapply them, and so you've introduced an extra step. So that's where it gets into the duplication of effort that is probably not that useful. But yeah. So that was a good question, Revital, that you wrote down and that I asked on your behalf because you wrote it down. So thank you. I want to talk about the state, right?

44:48

Right now GitOps is all about the desired state versus the actual state, so the desired state is something that is static. And maybe I'm thinking like five years going forward, five years from now, but do we think that one day we will have something like dynamic state, right? So it's not only about the state that I store in Git, but also, basically, what about

45:16

performance and cost and compliance? These will all be changed dynamically, right? Yeah, we actually already do, with something like a Horizontal Pod Autoscaler, where it's changing the number of replicas that exist. That's an example where in Argo CD today, you can set an ignore difference on that field.

45:41

And you can let an operator run under some parameters. It might have a max number of pods that can run. So you have some sort of policy set around it. And then it's changing the state. Now, when it comes time to do a disaster recovery, you're not really worried about... getting the replica number right because your autoscaler is going to go and update that for you. So we do actually have some things like that, but...

46:07

I do think we probably will see more of those kinds of things, right? Of course. And as we have the AI that is coming and helping us, I'm assuming we will be way, way more sophisticated.
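The ignore-difference idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how Argo CD implements its diffing: the function names `drop_path` and `in_sync`, the slash-separated path format, and the sample manifests are all assumptions made for the example.

```python
import copy


def drop_path(obj, path):
    """Remove the value at a slash-separated path such as "/spec/replicas", if present."""
    keys = [k for k in path.split("/") if k]
    if not keys:
        return
    node = obj
    for key in keys[:-1]:
        if not isinstance(node, dict) or key not in node:
            return
        node = node[key]
    if isinstance(node, dict):
        node.pop(keys[-1], None)


def in_sync(desired, live, ignored_paths):
    """Compare desired and live state after removing fields delegated to another controller."""
    d, l = copy.deepcopy(desired), copy.deepcopy(live)
    for path in ignored_paths:
        drop_path(d, path)
        drop_path(l, path)
    return d == l


# Hypothetical manifests: Git says 2 replicas, the autoscaler has scaled to 7.
desired = {"spec": {"replicas": 2, "template": {"image": "web:1.4"}}}
live = {"spec": {"replicas": 7, "template": {"image": "web:1.4"}}}

print(in_sync(desired, live, []))                  # False
print(in_sync(desired, live, ["/spec/replicas"]))  # True
```

With the replica count ignored, the drift that the autoscaler introduces no longer counts as out of sync, while a change to any other field (the image, for instance) still would.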

46:22

Well, I think that the question ties in really well with discussing GitOps and platform engineering. I've got a really good example from Intuit right now. On top of Numaproj, which is a tool that Intuit open sourced to do kind of streaming applications and analysis, we have an AI system that monitors applications, and it creates a score,

46:49

and we call it an anomaly score. Either this app looks like it's behaving basically normally, or it looks like it's behaving weird and maybe we need someone to come intervene, or we want to gate a rollout deployment on that number. But the user can't really just fire and forget with this tool. Sometimes there needs to be some tuning. The exact threshold is something that's more art than science.
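Gating a rollout on a score like that can be sketched roughly as follows. Everything here is an assumption for illustration: the source does not say how the score is scaled or consumed, so this sketch assumes a higher score means more anomalous, and the names `should_promote` and `gate_rollout` are hypothetical.

```python
def should_promote(anomaly_score, threshold):
    """Assumed convention: a score at or below the threshold counts as normal."""
    return anomaly_score <= threshold


def gate_rollout(scores, threshold):
    """Promote only if every sampled score during the analysis window looks normal."""
    return all(should_promote(score, threshold) for score in scores)


# Hypothetical samples collected while a new version receives traffic.
healthy_window = [1.2, 0.8, 2.1]
noisy_window = [1.2, 6.5, 2.1]

print(gate_rollout(healthy_window, threshold=3.0))  # True
print(gate_rollout(noisy_window, threshold=3.0))    # False
```

The threshold is the part that needs tuning per application, which is why keeping it as an explicit, reviewable value matters.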

47:18

So I think the way that AI can effectively collaborate with a platform in a really nice way in this case is to open a PR, and the AI agent can say, all right, I've done some analysis. I've calculated that the threshold score that makes sense for your application is this. Take a look. Read the docs, read the explanation, and then choose to merge it or not. But then...

47:45

You have a record of what the AI is doing. You have a record of a human saying, yep, that makes sense. And then you have it stored in the intent of the application in Git. Other people can look at it and go, yeah, this is how the application is supposed to be behaving according to these various inputs. That's actually something that I really want. Even more than at the infrastructure level, it would be amazing at the user level.

48:11

Right? So if you say, oh, you introduced this new feature, these metrics are doing these things, and I've got a system where I can automatically look at the different metrics and be like, oh, you introduced this feature, and that correlates to these metrics changing. And hey, you may not want this one changing this way. That may not be good. Heads up on that. And this one actually is changing

48:38

in a really great way, and you might be able to improve it by even introducing this additional little feature, and let me go make a PR for that, you can go look at it. Working at the user level like that would be really cool, but I don't know that

⁠¶ AI's Impact on Deployment Frequency

48:51

The biggest change for that at an operational level, and this is true for all of AI, and I think I submitted a talk about this as well, and Victor and I talked a little bit about it, is that... When you are deploying more frequently, which is the output of what we're talking about, when you're deploying more frequently, if you experience the same failure rate, your frequency of failures will...

49:16

increase at a commensurate rate with your increase in deployments, right? So if your failure rate stays at 1%, but suddenly you're deploying 100 times more frequently, you'll experience 100 times more failures in a week than you did before. And that becomes a really big deal. So the pressure to get CD really well tuned,

49:40

right? And this is why I work at Octopus Deploy. I look at that problem and I'm like, oh, AI is going to make that problem so much bigger, and we're going to have such a bigger job. We're going to have so many more customers, because they need to improve their CD and get it

49:54

really well tuned, because you have to reduce the failure rate, and I don't think anybody else is really focusing on that right now. They're all saying, wow, look at all these productivity gains. But the failure rate hasn't really changed. So when you do that, then you have to find a way to lower the failure rate. Yeah, totally makes sense.
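The arithmetic behind that point is simple enough to write down. A minimal sketch in Python, where the function names and the baseline of 10 deployments a week are illustrative assumptions; only the 1% failure rate and the 100x increase come from the discussion:

```python
import math


def expected_failures(deployments, failure_rate):
    """Expected number of failed deployments over a period."""
    return deployments * failure_rate


def rate_to_hold_failures_constant(failure_rate, frequency_multiplier):
    """Failure rate needed so the failure count stays flat as deployment frequency grows."""
    return failure_rate / frequency_multiplier


baseline_deploys = 10   # assumed deployments per week before
multiplier = 100        # deploying 100 times more frequently
rate = 0.01             # 1% failure rate, unchanged

before = expected_failures(baseline_deploys, rate)
after = expected_failures(baseline_deploys * multiplier, rate)

print(f"failures per week before: {before:.1f}")
print(f"failures per week after:  {after:.1f}")
print(f"growth factor: {after / before:.0f}x")
print(f"rate needed to stay flat: {rate_to_hold_failures_constant(rate, multiplier):.4%}")
```

Under these assumptions, keeping the weekly failure count where it was means cutting the failure rate by the same factor that deployment frequency grew.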

⁠¶ Building Trust with Automated GitOps Sync

50:20

Yeah, Revital, so I don't know if that was a satisfactory answer for you. Who knows? We'll see, right? Fair point.

50:34

But Michael, I have to say, I like your solution with the human in the loop. We don't let the machines do everything automatically yet. So there's always a human in the loop who confirms and approves. Now, there's a trust build-up phase, just like with Argo CD. Most people, when they first use it, use manual sync. Yeah, because they're like, auto sync, what does that do? But after they click sync

51:03

and it just works for like the 30th or 40th time, they're like, why am I even in this loop? I'll just sort of switch it to auto. And as long as I built my controls in Git properly, then we're good to go and this thing can just run. That's like the Argo CD singularity. That's when the rules break down and everything moves faster than you could have possibly imagined. Whole new world.

51:27

And the product has to build that trust to get you to that point. And I think that that's where human-in-the-loop workflows are going to be really valuable. You know, at Intuit, we talk a lot about done for you, but I think in the early stage what you want is done with you. You want to build confidence that this tool is behaving the way that you would expect it to, so that at some point you can hand it off.

51:50

And that's a really great feeling in Argo CD when you can finally hand off the sync operation to be fully automated. And also why I spend so much time on environment promotions because... We've failed to make that possible in the GitOps ecosystem for application delivery where there are lots of environments involved. Yeah.

52:11

And that's where a lot of people feel it, because they're moving from dev, stage, production, but then they're moving across regions. I mean, we've got users that are like, hey, I've got to deploy to 3,000 or 30,000 different deployment targets as part of a promotion. And if I don't have a way of managing that at an abstract layer, I'm going to be in trouble. To bring it back to your original point about whether there's going to be

⁠¶ Policy-Driven Automation and GitOps Philosophy

52:37

more meta state inside of Kubernetes, and I said there already is, right, with something like an autoscaler. I do think that we'll see more of those kinds of things, but they're always going to be policy driven, right? So the policy is always going to be around, like with a pod autoscaler, it's like, what's the maximum

53:02

surge you can do, and what are my limits on this, and what's my low end look like? And from a policy perspective, that's how you manage that state, right? You set some kind of policy, and then you have an automation engine that's going through and tweaking things as needed within that policy. And this is an area where, I think,

53:27

one of the reasons that I didn't gravitate towards Flux, or let's put it this way, something I didn't want to bring in from Flux, is that they really had that philosophy of, if any change is ever made, it always needs to be a Git commit. And to me, it always felt a little too ideological, like making a Git commit

53:53

because you want to change the pod replica number 35 times a day. It just feels like a bunch of junk in your Git history that's not useful. You have a policy around how many replicas you should have, so just let that be the policy, and then it can go up and down and you're not really worried about it. If you need to do a disaster recovery, you're fine.
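The "policy in Git, adjustments at runtime" split can be sketched like this. It is a simplified illustration under stated assumptions, not the Horizontal Pod Autoscaler's actual algorithm: the `ReplicaPolicy` type, the `clamp_replicas` function, and the numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaPolicy:
    """The part that would be declared once in Git (hypothetical shape)."""
    min_replicas: int
    max_replicas: int

    def __post_init__(self):
        if self.min_replicas > self.max_replicas:
            raise ValueError("min_replicas must not exceed max_replicas")


def clamp_replicas(requested, policy):
    """Keep a runtime scaling decision inside the declared bounds."""
    return max(policy.min_replicas, min(requested, policy.max_replicas))


policy = ReplicaPolicy(min_replicas=2, max_replicas=20)

# Hypothetical scaling decisions over a day; none of them needs a commit.
for requested in (1, 8, 35):
    print(requested, "->", clamp_replicas(requested, policy))
```

Only the bounds live in version control. The dozens of daily replica changes stay out of the Git history, and after a disaster recovery the automation simply recomputes a value within the same bounds.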

54:16

But you can delegate things to controllers. I think that's one of the most powerful, important features in Argo CD, and I think it's totally underrated. I think people undervalue that, because without the ability to... Actually, this is something we could do in Argo CD, right? Conceptually, when you create an ignore difference, you are delegating the control of that field to something else. And in Argo CD,

54:43

you do that, but you imply it by setting the ignore field. You don't set an owner. So something we could do in the near future is, when you create those ignore fields, have even just a label on them to say, oh, this is actually a delegation to this thing. And there might be some way of creating, you know, a more generalized policy around that kind of thing, where it's like, oh, anytime somebody deploys a pod autoscaler,

55:09

we're going to automatically ignore differences on the replicas, on the fields that it has configuration for. There's things like that that we could do that would be really smart inside of Argo CD. Yeah. I actually want to be able to say, hey, run this service in Europe or North America under 100% error rate under $100 a month. That's what I want to be able to say.

55:37

And maybe have it dynamically change where it's deployed based on the cost of running services at that time of day. Yeah, that's cool. I don't know if that's going to be an Argo CD job. I think that's a separate.

⁠¶ Git as Interface, AI for Optimization

55:51

Yeah, well, Argo CD could be involved. I think you've put it really well about how you don't want every little thing to be in Git. To me, GitOps is about humans: we're communicating via Git as an interface. And it's awesome to communicate through it because there's so much tooling built around it. And if it's useful to have an AI communicate to you, hey, you know,

56:20

here's your deployment manifest, here's how it's going to save you money changing it from this to that, do you want me to proceed? Then, you know, it's a fantastic combination of Argo CD and GitOps plus AI workflows. If you'd rather just have it done for you, keep it out of Git, put it in a controller and just let it run. You just have to be very confident that the system functions really well,

⁠¶ Episode Conclusion and Community Outreach

56:44

because you don't want to hand off something and then end up, you know, losing traffic. Yeah, absolutely. All right. So we're coming up on the end of the show. We didn't even talk about the relationship between platform engineers and software developers. So that will be maybe another episode. Yeah, maybe. We could have a group therapy session around that one. True.

57:14

So Revital, as we come to the end, what do the people need to know? And thank you again, Michael, for joining us. I think that was a fun discussion, and hopefully people want more of it. We'll come back and do a part two. So first of all, subscribe here, Argo Unpacked. On the website, we have all of the previous episodes, all of the upcoming episodes. It's not the video of the week, but every two or three weeks we're showing a different fantastic short that we have generated thanks to the show.

57:49

We also have our socials: LinkedIn, X, and Bluesky. Stay tuned. Subscribe. We have our YouTube channel. We recently generated a lot of very nice shorts. So check them out. Oh, that reminds me, Revital. Today's Argo tip is brought to you by GitOps Champions, because that's where the conversation started. It's the Discord

58:17

that when you sign up for the GitOps training and certification at learning.codefresh.io, you get an invite to a Discord where we talk all about GitOps and implementation. It's full of practitioners.

58:28

So feel free to join the GitOps Champions Discord, because that's where that conversation came from, and that's why we actually talked about it this week. And then I'll also add, Revital, that right after KubeCon North America, we're going to be doing a recap, an Octopus Deploy webinar recap, on Tuesday, November 18th. We'll share a link to this, but it'll be John and I talking about the findings from KubeCon. And then we'll be back

58:56

with another episode of Argo Unpacked on December 1st. So plenty more coming up, lots to do. And definitely, you know, we're gonna be heading into 2026. I think it's gonna be the year of GitOps. the year of magical, wonderful platform engineering and Kubernetes deployments for everybody. So any final words, Michael, anything you need to promote or put out there? Shout out to Intuit.

59:23

and my team for making Argo as awesome as it is. Go try Source Hydrator, an alpha feature in Argo CD. I think it's going to be a huge improvement for platform engineering and GitOps. And we didn't really get into it much, but check out GitOps Promoter. It's something my team has been working on a lot to make environment promotion in GitOps as seamless as the rest of GitOps.

59:47

Yeah. Check all those things out, and file your taxes with TurboTax. Tax season is going to be coming up soon. Get those receipts in order. If you want to expense this show, you could, for professional development, but unfortunately it is free, so it's going to be limited in its usefulness as a tax vehicle. At least that's my understanding, Michael. Yeah, I think that's legally correct. Not legal advice, though.

01:00:12

All right. Well, thanks everybody for joining us on another episode of Argo Unpacked. Stick around and hang out in our chat, or I should say on social media, leave a comment. We'd love to hear from you. And we'll see you all next time. And hopefully we'll see a lot of you in Atlanta next week. Thanks everybody. Thanks.

✨ This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.
