Welcome to the deep dive. We're your shortcut to getting genuinely well informed fast without drowning you in details.
Yeah, we focus on those core insights, those aha moments.
And today we are taking a really fascinating deep dive into the world of Argo CD and GitOps. We're using Argo CD: Up and Running, a hands-on guide to GitOps and Kubernetes by Andrew Block and Christian Hernandez, as our guide. It's pretty comprehensive.
Definitely, and our mission today really is to unpack what GitOps actually is, how Argo CD brings it to life, and why it's becoming such a standard for cloud native apps.
Right. So, whether you're prepping for that big meeting or just trying to stay current, or maybe you're just, you know, curious.
We'll give you the essential nuggets so you walk away understanding this whole paradigm.
Okay, let's start unpacking GitOps. We hear this term everywhere, so at its very core, what is it?
Well, the book defines it pretty simply. It's basically a method for managing your infrastructure and your applications by describing the state you want them to be in declaratively in Git, and using Git as that single source of truth.
That sounds simple, but I guess the revolutionary part is the shift, right? From telling systems exactly how to do things with imperative commands to just defining what you want, in a way that's auditable and trackable because it's in Git.
Precisely. Yeah, that's the real aha moment there. And you know, the origin story is pretty interesting too.
Oh yeah, tell me about that.
It goes back to 2017 with Weaveworks. They apparently had this incident where someone fat-fingered a config change.
Oops, we've all been there.
It took down their whole platform. Yeah, but the way they recovered, by rolling back using Git, sort of revealed this new, better way of working.
Ah. So necessity is the mother of invention.
Kind of, yeah. And their CEO, Alexis Richardson, he's the one who coined the term GitOps. It came out of a real mess, basically.
Okay, that makes sense. And it's important to note, right, this isn't some completely separate thing from DevOps?
No, not at all. The book is really clear on this. It says GitOps is DevOps. GitOps is the natural progression of DevOps.
So it's like formalizing the good stuff people were already trying to do: version control, automation.
Exactly. It takes those DevOps best practices and puts them into a really solid, auditable, automated workflow, like putting guardrails on things.
Gotcha. So if it's a progression, what specifically defines it? Are there rules?
Yeah, there are. The OpenGitOps principles, which came out in 2021, lay out four key tenets. First, it has to be declarative.
Declarative, meaning you describe the what, not the how, like a Kubernetes manifest.
Precisely. Your entire system state, apps, infra, networks, whatever, is described declaratively. You say what the end state should look like.
Okay. Principle one: declarative. What's next?
Second, versioned and immutable. The desired state lives entirely in Git.
Ah, the source of truth part.
Exactly. And because it's Git, you get a complete, unchangeable history of every single change. Auditing, rollbacks, it's all built in.
That's huge. Okay.
Third, and this is a really key difference from a lot of traditional CI/CD: changes are pulled automatically.
Pulled, not pushed, right?
Right. Instead of a CI server pushing updates, you have GitOps agents or controllers inside your environment actively pulling the definitions from Git. They run this reconciliation loop.
A reconciliation loop, like a little robot constantly checking?
Yeah, constantly checking: does the live system match the blueprint in Git? It makes it more secure, more robust than just reacting to webhooks.
Interesting, okay. And the fourth principle follows from that?
It does. It's continuously reconciled. Those agents are always observing the live state and working to make it match the desired state from Git. It's like Kubernetes controllers, but for your whole application stack: always self-healing, always striving for that Git state.
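To make the reconciliation loop concrete, here is a minimal Python sketch of a single pass. It is an illustration only, not how Argo CD or any real GitOps agent is implemented: the dictionaries standing in for "desired state in Git" and "live state in the cluster", and the names `diff` and `reconcile_once`, are all assumptions made up for this example.

```python
def diff(desired: dict, live: dict) -> dict:
    """Return the changes needed to make `live` match `desired`."""
    changes = {}
    for name, spec in desired.items():
        if live.get(name) != spec:
            changes[name] = ("apply", spec)   # missing or drifted resource
    for name in live:
        if name not in desired:
            changes[name] = ("delete", None)  # resource no longer declared
    return changes


def reconcile_once(desired: dict, live: dict) -> dict:
    """One pass of the loop: compute drift, then converge the live state."""
    for name, (action, spec) in diff(desired, live).items():
        if action == "apply":
            live[name] = spec
        else:
            del live[name]
    return live


# Hypothetical states: Git says 3 web replicas; someone scaled down by hand
# and left a stray debug pod behind.
desired_state = {"web": {"replicas": 3}, "api": {"replicas": 2}}
live_state = {"web": {"replicas": 1}, "debug-pod": {"replicas": 1}}
reconcile_once(desired_state, live_state)
```

A real agent would run this pass repeatedly on a timer and talk to Git and the Kubernetes API rather than to in-memory dictionaries.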
Okay. That makes GitOps much clearer. So where does Argo CD fit in this landscape?
Argo CD is a really prominent tool in this space. It's part of the wider Argo project and was built specifically for GitOps. Developer experience was a big focus too.
And its main job, its core purpose?
Is delivering changes to Kubernetes clusters, potentially at massive scale, following those GitOps principles we just talked about.
And the big value proposition seems to be around configuration drift.
Absolutely, that's where it really shines. It actively detects and prevents configuration drift. You know, when your live cluster gets out of sync with Git because someone made a manual change or something broke.
Argo CD sees the mismatch and flags it, or fixes it?
Both, potentially. It constantly compares the YAML in Git to the live state. If they diverge, it lets you know, and you can configure it to automatically sync things back to the desired state.
Okay, and you mentioned it brought structure to Kubernetes management.
Yeah. Before tools like Argo CD, managing all the different Kubernetes pieces, Deployments, Services, ConfigMaps, could be a bit loose.
Right, just a bunch of YAML files.
Argo CD introduced the concept of an Argo CD Application. Think of it as an atomic unit of work. It bundles everything related to your app, the Deployment, the Service, Ingress, everything, into one manageable thing.
So deployments become more coherent, auditable, automated.
Exactly. You're dealing with the whole application picture at once.
And it's flexible too, right? Not just for strict GitOps.
That's true. The book points out it can function as a general purpose deployment tool even if you're not doing full GitOps yet, and it integrates nicely with tools people already use, like Helm and Kustomize. We'll get into those more.
Cool. So it's adaptable. Let's maybe peel back the layers a bit. What's the architecture look like under the hood?
Sure. So Argo CD itself follows a microservices architecture. It's made up of several smaller, independent components working together.
Okay, typical cloud native design.
Right. And it leans heavily on Kubernetes primitives. It uses Kubernetes concepts effectively. What's really fundamental is how it implements the Kubernetes controller pattern.
Ah, the reconciliation loop idea again.
Exactly. Just like built-in Kubernetes controllers watch resources, Argo CD watches two things: the desired state in Git and the actual state in your clusters.
If there's a difference, it steps in and enforces the Git state, prevents that drift.
Precisely, it's constantly reconciling.
So what are the main components doing this work?
Okay. First, you have the repository server. Its job is to maintain a local cache of your Git repos or Helm chart sources, caching for speed, and then it generates the final Kubernetes manifests from whatever source you're using: Helm, Kustomize, plain YAML.
It preps the manifests, got it. Manifest prep station. What else?
Then there's the API server. This is the main hub. It uses gRPC and REST, pretty standard stuff, for communication, and it manages everything really: application status, triggering syncs, handling RBAC, role-based access control, you know, who can do what. And it also serves the web UI. The user interface runs off this API server.
Makes sense.
And for caching, it uses Redis.
Redis, okay, fast in-memory database.
Right. It's there for local caching to speed things up, reduce calls to Git, and store some temporary state. Important note, though: this Redis cache isn't persistent. If Argo CD restarts, it rebuilds the cache from Git.
Good to know. Not a permanent data store.
And finally, a really key part: its own custom resources, or CRDs. These extend Kubernetes itself. Argo CD adds things like Application, AppProject, and ApplicationSet.
So you manage Argo CD deployments using these Kubernetes-native objects.
Exactly. These CRDs become your main way of interacting with Argo CD and defining your GitOps deployments within Kubernetes itself.
Very cool. Okay, so how do you actually get this thing running and start using it?
Installation is pretty standard Kubernetes stuff. You can use the raw YAML manifests they provide or, more commonly these days, Helm charts. The book guides you through setting it up in a local kind cluster, Kubernetes in Docker, which is great for just playing around and learning.
And interacting with it? I hear the UI is quite good.
It really is. The user interface gives you a great visual overview. You can see what apps are doing, their sync status and health, and manage settings. It's very user friendly.
How do you access it initially?
You can start with kubectl port-forward, just to tunnel into it quickly, but for anything more permanent, you'd set up a proper Kubernetes Ingress to give it a real hostname, like argocd.yourdomain.local or something.
Right, make it accessible like a normal web app. What about the command line?
Yep, there's the argocd CLI tool. It actually offers more functionality than the UI sometimes, hitting the same back-end API. It's where you do more advanced things, like adding remote clusters.
UI for visuals, CLI for power users and automation. But here's the really neat GitOps part I read about: you manage Argo CD itself declaratively.
Yes, this is super elegant. Argo CD's own configuration, like connection details for Git repos, security settings, UI tweaks, is stored in standard Kubernetes ConfigMaps and Secrets.
So you can put Argo CD config in Git and have Argo CD apply its own config using GitOps?
Exactly, it manages itself. You define its configuration declaratively, commit it, and Argo CD picks it up and reconfigures itself. It's GitOps all the way down.
That is elegant. Okay, let's get to the heart of it, managing and synchronizing actual applications. This application CRD seems key.
It is. An Argo CD Application is basically a pointer. It tells Argo CD about a set of Kubernetes resources that belong together. It primarily defines two things.
What are those?
First, the .spec.source. This is where the manifests live: usually a Git repo URL, maybe a specific branch or tag, and a path within that repo.
Okay, where the blueprint is.
Right. And interestingly, newer versions, v2.6 onwards, support multi-source applications.
Oh, what does that mean?
It means you can pull manifests from multiple sources, like maybe your main app deployment from one Git repo but its database configuration from another, and combine them into a single Argo CD Application. Really powerful for complex setups.
Okay, that's the source. What's the second part?
The .spec.destination. This just tells Argo CD where to deploy those manifests: which Kubernetes cluster and which namespace within that cluster.
Could be the same cluster Argo CD is running on or a different one?
Yep, you can specify in-cluster for the local cluster or provide the API server address for a remote cluster.
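As a rough picture of the source and destination halves just described, the sketch below builds an Application manifest as a plain Python dictionary and prints it as JSON (which Kubernetes accepts in place of YAML). The repository URL, path, and application name are placeholders invented for this example, and `make_application` is a hypothetical helper, not part of any Argo CD library. The address `https://kubernetes.default.svc` is the conventional in-cluster API server address.

```python
import json


def make_application(name, repo_url, path, namespace, revision="HEAD",
                     server="https://kubernetes.default.svc"):
    """Build a minimal Argo CD Application manifest (illustrative only)."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Where the manifests live: repo, revision (branch/tag), and path.
            "source": {
                "repoURL": repo_url,
                "targetRevision": revision,
                "path": path,
            },
            # Where to deploy them: cluster API server and namespace.
            "destination": {"server": server, "namespace": namespace},
        },
    }


# Placeholder values, assumed for the example.
app = make_application(
    name="guestbook",
    repo_url="https://example.com/org/deployments.git",
    path="apps/guestbook",
    namespace="guestbook",
)
print(json.dumps(app, indent=2))
```

Pointing the same helper at a remote cluster would only mean passing that cluster's API server address as `server`.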
And to manage the actual YAML complexity, you mentioned Helm and Kustomize.
Absolutely. You rarely want to be just copying and pasting raw YAML everywhere. Argo CD has first-class support for Helm, which is kind of the standard package manager for Kubernetes now.
Right, for templating and managing dependencies.
And also Kustomize, which is great for applying environment-specific patches or variations to base manifests without actually changing the original files. You use bases and overlays.
So Argo CD works with these tools to render the final manifests before applying them.
Exactly. It uses the repository server component we talked about to run Helm or Kustomize based on your Application spec.
Okay, now the magic part: synchronization. You said by default nothing happens when you create an application.
Correct, and this is surprising but also really important. By default, creating an Application resource just registers it with Argo CD. It doesn't actually apply anything to your cluster yet. It gives you a chance to review. You can look in the UI, see what Argo CD would deploy, and make sure it looks right before you manually click sync or enable automation. It's a safety net.
Ah okay, a deliberate pause. But you can automate it?
Oh yes, you define a sync policy in the Application manifest. This lets you turn on automated sync.
And what options do you have there?
Key ones are prune: true, which tells Argo CD to automatically delete resources from the cluster if they're removed from Git. Garbage collection, essentially.
Important for keeping things clean.
And selfHeal: true. This tells Argo CD to automatically fix any drift it detects. If someone manually changes something in the cluster, self-heal will revert it back to the Git state.
So truly enforcing that Git source of truth.
Exactly. The sync policy also lets you set up retry strategies if a sync fails, with backoffs and limits.
What about controlling things at a finer grain than the whole application?
You can. You can use Kubernetes annotations directly on individual resource manifests in Git. For instance, you might annotate a PersistentVolumeClaim with argocd.argoproj.io/sync-options: Prune=false to prevent Argo CD from ever deleting it, even if it's removed from Git.
Ah, protecting stateful things.
Right. Or there's SkipDryRunOnMissingResource=true, which can be useful if your application installs its own CRDs or operators. Sometimes the dry run fails initially because the CRD isn't there yet, so this lets you bypass that check for specific resources.
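The sync policy and the per-resource annotations discussed above can be sketched as data. The `sync_policy` dictionary mirrors the `spec.syncPolicy` section of an Application, and the PVC carries the `Prune=false` annotation. The retry numbers are arbitrary example values, and the two helper functions are hypothetical: they only illustrate how such an annotation could be read, not how Argo CD itself parses it.

```python
SYNC_OPTIONS = "argocd.argoproj.io/sync-options"

# Would sit under spec.syncPolicy in an Application manifest.
sync_policy = {
    "automated": {"prune": True, "selfHeal": True},
    "retry": {
        "limit": 5,  # example value
        "backoff": {"duration": "5s", "factor": 2, "maxDuration": "3m"},
    },
}

# A stateful resource that should survive even if removed from Git.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "name": "app-data",  # placeholder name
        "annotations": {SYNC_OPTIONS: "Prune=false"},
    },
}


def resource_sync_options(resource: dict) -> dict:
    """Parse a comma-separated sync-options annotation into a dict."""
    raw = resource.get("metadata", {}).get("annotations", {}).get(SYNC_OPTIONS, "")
    options = {}
    for item in filter(None, (part.strip() for part in raw.split(","))):
        key, _, value = item.partition("=")
        options[key] = value
    return options


def should_prune(resource: dict) -> bool:
    """Illustrative check: prune unless the resource opts out."""
    return resource_sync_options(resource).get("Prune", "true") != "false"
```

With this sketch, `should_prune(pvc)` is false, while a resource without the annotation falls back to being prunable.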
Okay, annotations for fine-tuning. What about running tasks during sync, like database migrations?
That's where hooks come in. Argo CD lets you define Kubernetes Jobs or Pods as hooks that run at specific points in the sync life cycle.
When would you use those?
You have PreSync hooks that run before anything else, Sync hooks during the main sync, PostSync after a successful sync, SyncFail if it bombs out, and even PostDelete.
So PostSync would be perfect for database migrations: apply the new app code, then run the migration job.
Exactly. Or PreSync could be used to, say, take a database backup before an upgrade. Very powerful automation.
And for really complex apps with lots of dependencies, how do you manage the order?
That's where sync waves are incredibly useful. Sometimes you need resource A, like a database, to be fully up and running before resource B, your application, starts.
Right, dependencies.
Sync waves let you assign numerical waves to your resources using annotations. Argo CD applies resources in wave order, waiting for resources in one wave to become healthy before starting the next: wave zero, then wave one, wave two, and so on.
Ah, so you can orchestrate complex rollouts precisely.
Precisely. It prevents things from failing just because a dependency wasn't ready yet.
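A small Python sketch can show the ordering idea behind sync waves and how a hook is marked. The annotation keys `argocd.argoproj.io/sync-wave` and `argocd.argoproj.io/hook` are the ones Argo CD documents; everything else here (the resource names, the `plan_waves` helper, and the simplified grouping) is an assumption for illustration and ignores the health checks a real sync performs between waves.

```python
from itertools import groupby

WAVE = "argocd.argoproj.io/sync-wave"
HOOK = "argocd.argoproj.io/hook"


def wave_of(resource: dict) -> int:
    """Resources without the annotation default to wave 0."""
    return int(resource["metadata"].get("annotations", {}).get(WAVE, "0"))


def plan_waves(resources: list) -> list:
    """Group resource names by wave, lowest wave first."""
    ordered = sorted(resources, key=wave_of)
    return [
        (wave, [r["metadata"]["name"] for r in group])
        for wave, group in groupby(ordered, key=wave_of)
    ]


# Placeholder resources: database first, then backend, then frontend.
resources = [
    {"kind": "Deployment", "metadata": {"name": "frontend", "annotations": {WAVE: "2"}}},
    {"kind": "StatefulSet", "metadata": {"name": "postgres"}},
    {"kind": "Deployment", "metadata": {"name": "backend", "annotations": {WAVE: "1"}}},
]

# A Job marked as a PostSync hook, e.g. for a database migration.
migration_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "db-migrate", "annotations": {HOOK: "PostSync"}},
}

print(plan_waves(resources))
```

In this sketch the plan comes out as wave 0 with `postgres`, wave 1 with `backend`, and wave 2 with `frontend`.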
One more sync thing: ignoring differences. Sometimes the live state should be different, right, like with autoscaling?
Good point. Yes, Argo CD lets you configure it to ignore specific differences. Maybe you have a HorizontalPodAutoscaler managing the replica count of a Deployment. You don't want Argo CD fighting with the HPA.
So you tell Argo CD, hey, ignore changes to spec.replicas for this Deployment.
Exactly. You can configure these ignores at the application level or even globally for certain resource types or specific JSON paths across your whole Argo CD instance. Very flexible.
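Here is a hedged sketch of the idea behind ignoring differences. The `ignore_differences` list has the shape of the `spec.ignoreDifferences` field on an Application; the comparison helpers `drop_pointer` and `in_sync` are hypothetical stand-ins that only handle simple nested dictionaries, not Argo CD's real diffing engine.

```python
import copy

# Would sit under spec.ignoreDifferences in an Application manifest.
ignore_differences = [
    {"group": "apps", "kind": "Deployment", "jsonPointers": ["/spec/replicas"]},
]


def drop_pointer(obj: dict, pointer: str) -> dict:
    """Return a copy of obj with the field at a simple JSON pointer removed."""
    trimmed = copy.deepcopy(obj)
    *parents, leaf = pointer.strip("/").split("/")
    node = trimmed
    for key in parents:
        node = node.get(key, {})
    node.pop(leaf, None)
    return trimmed


def in_sync(desired: dict, live: dict, pointers: list) -> bool:
    """Compare two manifests after removing the ignored fields from both."""
    for pointer in pointers:
        desired = drop_pointer(desired, pointer)
        live = drop_pointer(live, pointer)
    return desired == live


# Git says 2 replicas; the HPA has scaled the live Deployment to 5.
desired = {"kind": "Deployment", "spec": {"replicas": 2, "paused": False}}
live = {"kind": "Deployment", "spec": {"replicas": 5, "paused": False}}
pointers = ignore_differences[0]["jsonPointers"]
```

With the pointer in place, `in_sync(desired, live, pointers)` treats the two as matching; with an empty pointer list, the replica difference shows up as drift.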
Okay, makes sense. Now what about application health? How does Argo CD know if a deployment actually worked?
It relies heavily on Kubernetes' built-in health checks. This means defining proper readiness and liveness probes in your Deployments and StatefulSets is absolutely critical.
Because Argo CD uses those probe statuses to decide if a resource, and therefore the whole application, is healthy.
Exactly. If your probes aren't set up correctly, Argo CD might think an app is healthy when it's not, or vice versa. Good probes are fundamental to reliable GitOps.
Does Argo CD add its own health checks too?
It does. For many standard Kubernetes resource types, it has built-in logic to understand the health of Deployments, Services, StatefulSets, etc. However, there's a catch, a surprising fact in the book: the built-in health check for Argo CD's own Application CRD was actually removed by default back in Argo CD version 1.8.
Wait, why would they do that and why does it matter?
It matters if you're doing complex orchestrations where one Argo CD Application needs to know if another Argo CD Application is healthy before it proceeds, like in the app of apps pattern we might discuss.
Ah, so if you need applications to depend on each other's health...
You need to explicitly re-enable that Application health check in the main Argo CD ConfigMap, argocd-cm. It's off by default, which can trip people up when building those advanced workflows.
Good tip. Okay, let's shift gears to more advanced topics operationalizing this stuff. Authentication and authorization seem crucial.
Absolutely. Out of the box, Argo CD gives you a single admin user. The initial password is in a Kubernetes Secret.
Change it immediately?
Yes, please. Then you can define additional local users. But more realistically, in an enterprise setup, you'll integrate with single sign-on, SSO.
Like using Okta or Google Workspace or something?
Exactly. Argo CD supports standard protocols like OpenID Connect, OIDC, or you can use an identity broker like Dex, which can then connect to LDAP, SAML, GitHub, whatever your company uses. The book uses Keycloak as its example.
Okay, so users log in with their normal company credentials. What about what they can do once logged in.
That's where Argo CD's built-in RBAC comes in. Role-based access control. It governs permissions.
Does it have default roles?
Yes. It starts with a powerful role:admin and a useful role:readonly, but you can define your own custom roles using simple CSV policies stored in a ConfigMap.
So you can get really granular, like this team can only deploy to the dev namespace?
Precisely, and that leads into projects, or AppProject resources. These are fundamental for multi-tenancy or just organizing things.
What do projects control?
They act as logical groupings for applications. Crucially, they let administrators restrict things within that group.
Like what kind of restrictions?
You can restrict which Git repositories applications in that project are allowed to pull from.
Ah, security boundaries.
Exactly. And which destination clusters or namespaces they can deploy to. You can even blacklist or whitelist specific kinds of Kubernetes resources, maybe disallow creating ClusterRoles from a certain project.
Wow. Okay, so projects are like security and policy sandboxes for groups of applications.
Yep, and you assign permissions: which users or SSO groups can access the project and what roles they have, at the project level too.
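The project restrictions and CSV policies described here might look roughly like the sketch below. The AppProject fields (`sourceRepos`, `destinations`, `clusterResourceWhitelist`, `roles`) and the `p, ...` / `g, ...` policy line format follow Argo CD's documented conventions, but the team name, repository URL, group name, and the `destination_allowed` helper are all invented for illustration. The helper uses simple glob matching as an approximation.

```python
from fnmatch import fnmatch

app_project = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "AppProject",
    "metadata": {"name": "team-a", "namespace": "argocd"},
    "spec": {
        # Only repositories under this (placeholder) organization are allowed.
        "sourceRepos": ["https://example.com/team-a/*"],
        # Only namespaces matching team-a-* on the local cluster.
        "destinations": [
            {"server": "https://kubernetes.default.svc", "namespace": "team-a-*"},
        ],
        # Only Namespace is permitted among cluster-scoped kinds,
        # so ClusterRoles cannot be created from this project.
        "clusterResourceWhitelist": [{"group": "", "kind": "Namespace"}],
        "roles": [{
            "name": "deployer",
            "policies": [
                "p, proj:team-a:deployer, applications, sync, team-a/*, allow",
            ],
            "groups": ["team-a-devs"],  # hypothetical SSO group
        }],
    },
}

# Global policies of the kind stored as policy.csv in the RBAC ConfigMap.
policy_csv = "\n".join([
    "p, role:team-a-viewer, applications, get, team-a/*, allow",
    "g, team-a-devs, role:team-a-viewer",
])


def destination_allowed(project: dict, server: str, namespace: str) -> bool:
    """Illustrative check of a destination against the project's allow list."""
    return any(
        fnmatch(server, d["server"]) and fnmatch(namespace, d["namespace"])
        for d in project["spec"]["destinations"]
    )
```

Under these assumed rules, a deployment to `team-a-dev` on the local cluster passes the check and one to `kube-system` does not.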
There's also something called application sync impersonation. That sounds important for security.
It is, and it's relatively new, from v2.14 onwards. It lets you specify a different Kubernetes service account for Argo CD to use when it actually applies manifests to a cluster for a specific application.
Why is that better than Argo CD just using its own default service account?
Principle of least privilege. The default Argo CD service account often needs fairly broad permissions to manage CRDs and things. By using impersonation, you can create a dedicated service account for an application that only has permissions to manage Deployments and Services in a specific namespace, for example.
Ah, so even if the main Argo CD controller was somehow compromised, the blast radius for what it could do during a sync is much smaller.
Exactly, it decouples the control plane's permissions from the sync execution permissions. Big security win.
Makes sense. Okay, what about managing multiple clusters? How does that work?
Argo CD uses a hub-and-spoke model. Your main Argo CD installation is the hub. It then manages deployments out to potentially many spoke clusters.
The control plane is centralized?
Generally, yes. The hub pushes, or more accurately applies, the configurations defined in Git out to the managed clusters.
How do you add those remote clusters?
You can use the argocd CLI. There's a command, argocd cluster add. Or, more GitOps style, you can define the cluster connection details, the API server endpoint and credentials, within a Kubernetes Secret and apply that Secret to the Argo CD control plane cluster.
So the cluster definitions themselves live in Git.
Yep. Argo CD watches for Secrets labeled correctly and automatically adds them as managed clusters.
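For the declarative route, a cluster Secret can be sketched as follows. The label `argocd.argoproj.io/secret-type: cluster` and the `name`, `server`, and `config` fields follow Argo CD's documented declarative setup; the cluster name, URL, token, and the `make_cluster_secret` helper are placeholders assumed for this example. In practice the credentials would come from a secrets manager or be encrypted before anything reaches Git.

```python
import json


def make_cluster_secret(name: str, server: str, bearer_token: str, ca_data: str) -> dict:
    """Build a Secret that Argo CD would pick up as a managed cluster (sketch)."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": f"cluster-{name}",
            "namespace": "argocd",
            # This label is what marks the Secret as a cluster definition.
            "labels": {"argocd.argoproj.io/secret-type": "cluster"},
        },
        "type": "Opaque",
        "stringData": {
            "name": name,
            "server": server,
            # Connection settings are stored as a JSON string.
            "config": json.dumps({
                "bearerToken": bearer_token,
                "tlsClientConfig": {"insecure": False, "caData": ca_data},
            }),
        },
    }


# Placeholder values only; never commit real tokens in plain text.
secret = make_cluster_secret(
    name="staging",
    server="https://staging.example.com:6443",
    bearer_token="<token>",
    ca_data="<base64-encoded CA certificate>",
)
print(json.dumps(secret, indent=2))
```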
Okay, so you have multiple clusters defined. How do you deploy the same app to multiple clusters? An Application CRD only points to one destination, right?
Correct. A single Application resource maps one source to one destination, but there are patterns for deploying to many. First is the app of apps pattern.
App of apps? What's that?
It's basically an Argo CD Application whose source Git repo contains manifests for other Argo CD Applications.
Meta. So one Application deploys more Applications.
Exactly. You might have a staging-apps Application that deploys the specific Application resources for your API service staging, frontend staging, et cetera. It's great for bootstrapping environments or managing application stacks together.
Clever, what's the other way?
ApplicationSets. These are designed specifically for this multi-deployment scenario. Think of an ApplicationSet as an application factory.
A factory?
You define an ApplicationSet with a template for an Argo CD Application, but you leave parts like the destination cluster or namespace dynamic. Then you use a generator.
Generators like what?
There are several. A list generator lets you just provide a static list of clusters or parameters. A cluster generator automatically finds all the clusters registered with Argo CD and creates an Application for each. There's even an SCM provider generator that can look at your Git hosting, like GitHub or GitLab, and create Applications based on the folders or branches it finds.
Wow, so ApplicationSets automate creating many similar Applications targeted at different places. Very useful for platform teams.
Hugely useful, reduces boilerplate config dramatically.
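The "application factory" idea can be sketched in a few lines of Python. The `application_set` dictionary has the shape of an ApplicationSet using the cluster generator with `{{name}}` and `{{server}}` placeholders; the repository URL and cluster list are made up. The `render` function is a simplified stand-in for what the ApplicationSet controller does, not its actual templating engine.

```python
application_set = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "ApplicationSet",
    "metadata": {"name": "guestbook", "namespace": "argocd"},
    "spec": {
        # Cluster generator: one set of parameters per registered cluster.
        "generators": [{"clusters": {}}],
        "template": {
            "metadata": {"name": "{{name}}-guestbook"},
            "spec": {
                "project": "default",
                "source": {
                    "repoURL": "https://example.com/org/deployments.git",  # placeholder
                    "targetRevision": "HEAD",
                    "path": "apps/guestbook",
                },
                "destination": {"server": "{{server}}", "namespace": "guestbook"},
            },
        },
    },
}


def render(node, params: dict):
    """Recursively substitute {{key}} placeholders in a template."""
    if isinstance(node, dict):
        return {k: render(v, params) for k, v in node.items()}
    if isinstance(node, list):
        return [render(v, params) for v in node]
    if isinstance(node, str):
        for key, value in params.items():
            node = node.replace("{{" + key + "}}", value)
    return node


# Hypothetical output of the cluster generator for two registered clusters.
clusters = [
    {"name": "staging", "server": "https://staging.example.com:6443"},
    {"name": "prod", "server": "https://prod.example.com:6443"},
]
applications = [render(application_set["spec"]["template"], c) for c in clusters]
```

Each entry in `applications` is then the body of one generated Application, differing only in its name and destination server.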
Let's talk more security: TLS certificates, connecting to private Git repos. Critical stuff.
Argo CD obviously supports configuring TLS for its own API server and UI. You can bring your own certs or generate self-signed ones. It also handles connecting securely to Git repositories. For HTTPS repos, you can store credentials like username and password or access tokens in Kubernetes Secrets that Argo CD is configured to use. For SSH, you store the private key in a Secret and configure the known hosts entries, again usually via Argo CD's config and Secrets.
Managed via GitOps. Okay, standard secure connection methods. What about verifying the commits themselves?
Yes, for extra assurance, you can enable Git commit signature verification using GnuPG (GPG) keys.
So Argo CD checks if the commit was signed by a trusted developer key before syncing.
Exactly. You configure Argo CD with the public keys of your trusted committers. If a commit in the tracked branch isn't signed, or is signed by an unknown key, Argo CD can refuse to sync it. It ensures code provenance.
Important for regulated environments or just high trust. Okay, scaling this up: monitoring. Essential?
You absolutely should integrate Argo CD with Prometheus for metrics collection and Grafana for visualization.
What kind of metrics do you get?
You get metrics about Argo CD itself: CPU and memory usage of its components, API request latency, number of applications being managed. But also crucial metrics about sync operations: how many syncs are happening, how long they take, success and failure rates. It gives you operational visibility beyond just your own app metrics.
And getting alerted when things go wrong.
That's where Argo CD Notifications comes in. It's an optional but highly recommended component. It watches Argo CD events.
Like sync failed or an app became unhealthy?
Sync success, failure, health degradation, sync starting, lots of triggers. It can then send nicely formatted notifications to Slack, Mattermost, email, PagerDuty, Teams, you name it. It keeps everyone in the loop automatically. Very handy.
What about keeping Argo CD itself reliable? High availability?
Definitely needed for production. Argo CD's components, the API server, repo server, application controller, are designed to run with multiple replicas for high availability, HA. If one pod dies, another takes over. You typically manage this with standard Kubernetes Deployments and potentially HorizontalPodAutoscalers.
And if you have thousands of applications, can one controller handle all that reconciliation?
It can become a bottleneck. For really large scale, Argo CD supports sharding the application controller.
Sharding, splitting the work?
Exactly. You can run multiple replicas of the controller, and Argo CD will automatically distribute the applications across them based on a hashing algorithm. Each controller instance only manages a subset of the total applications. This prevents one controller from becoming a hotspot and improves overall sync throughput.
Okay, HA and sharding for scale. Can you extend Argo CD's functionality if it doesn't support your specific tools out of the box?
Yes, through config management plugins, CMPs. Let's say you have some custom scripting process to generate your YAML, or maybe you use a less common tool.
You can teach Argo CD how to use it?
Pretty much. You package your tool or script into a container image, define how Argo CD should call it in the argocd-cm ConfigMap, and Argo CD's repository server will run your plugin in a sidecar container to generate the manifests. It's very flexible.
Nice. What about customizing the look and feel of the UI?
You can do that too. You could apply custom CSS style sheets to change colors, fonts, layout, maybe make the top bar red in your production Argo CD instances as a visual cue.
Simple but effective. Can you add new UI elements?
Yes, through UI extensions. These are more involved. You basically build custom React components that get loaded into the Argo CD UI. You could use this to add buttons that trigger external actions or display information from other systems right within the Argo CD interface.
Cool. Last major area: integration with CI. How does Argo CD fit into the broader CI/CD pipeline?
Good question. We mentioned the reconciliation loop runs periodically, every three minutes by default, but often you want changes deployed faster.
Right, as soon as they merge to main.
That's where webhooks come in. You configure your Git host, GitHub, GitLab, et cetera, to send a webhook notification to Argo CD whenever there's a push to the tracked branch. Argo CD can then immediately trigger a refresh and sync if needed. It provides that on-demand synchronization.
So you get the continuous reconciliation and fast updates via webhooks.
Best of both worlds. And you can integrate this further with your CI process, maybe using something like Tekton Pipelines.
Tekton, the Kubernetes-native CI/CD tool.
Right. Your Tekton pipeline could run tests, lint your manifests, maybe build an image, and then, as a final step after everything passes, it merges the change, which then triggers the Argo CD webhook.
So CI handles the validation before GitOps takes over for deployment.
Exactly. You can even have Tekton directly interact with the Argo CD API if needed. Tekton Triggers can listen for Git events to start these pipelines automatically. It creates a really robust, fully automated flow from code commit to running in the cluster.
That paints a really comprehensive picture. Looking ahead, what are some ongoing discussions or future considerations in the GitOps space?
One big one, especially for organizations starting out, is GitOps directory structures. How should you actually organize your repositories?
Is there one right answer?
Not really. The book mentions it often comes down to Conway's law: your org structure influences your system design. Some go for polyrepos, maybe one repo for the cluster platform stuff and separate repos for each application. Others go for large monorepos containing everything.
Pros and cons to both, I imagine.
Definitely. Polyrepos give teams more autonomy but can be harder to coordinate. Monorepos make cross-cutting changes easier but require more tooling and discipline.
What about storing rendered manifests? You mentioned that earlier.
Yes, the rendered manifest pattern. Instead of Argo CD pulling Helm charts or Kustomize bases and rendering them on the fly...
You pre-render them in your CI pipeline and commit the final YAML to Git.
Exactly. The benefit is absolute clarity. What's in Git is exactly what Argo CD will apply. No surprises from template rendering logic changing between environments. It makes auditing and diffing much simpler.
Interesting trade-off: more CI work, but simpler GitOps state.
Right. And related to Git structure is the recommended GitOps workflow. The book strongly advocates for trunk-based development.
As opposed to something like GitFlow with long-lived develop, staging, and main branches?
Yes. The problem with long-lived environment branches in a GitOps world is that environment-specific configuration, like replica counts or URLs, tends to diverge. Merging between these branches becomes a nightmare.
Ah, because you don't want staging's config merged into production.
Exactly. Trunk-based development, where changes flow quickly to the main branch and configuration differences are handled by tools like Kustomize or Helm values outside of Git branching, generally works much better with the GitOps model.
Good practical advice. Where can people go to learn more or get involved?
The community is really active. The CNCF Slack has #gitops and #argo-cd channels, which are great places to ask questions.
Okay, CNCF Slack.
Attending the public Argo project community meetings online is also a good way to see what's happening. And check out projects incubating in Argo Project Labs. There's the Argo CD Image Updater, which helps automate updating container image tags in Git, and a newer, more holistic project called Kargo is emerging, focused on orchestrating promotions and progressive delivery across different stages in a GitOps environment. It looks very promising.
Lots happening. Okay, let's try to wrap this up. We've really unpacked Argo CD and GitOps today, from Weaveworks' fat-finger origin story, the core GitOps principles...
Declarative, versioned, pulled, reconciled, to Argo CD's architecture, the controller pattern, how you interact via UI and CLI, managing applications with sources, destinations, sync policies, hooks, waves.
All the way to advanced topics like security, multi-cluster with ApplicationSets, HA, sharding, extensibility.
Right, you should now have a really solid foundation. This deep dive gives you that understanding of how it all clicks together.
You've definitely got more than just the buzzwords now: practical insights, some of those surprising details like the Application health check default, hopefully helping you navigate GitOps with more confidence.
So here's a final thought to leave you with. If Git truly becomes the single source of truth for our system's desired state, how might that change the way we think about all kinds of operational changes in our organizations, beyond just application deployments? What else could we manage declaratively via Git?
It's a powerful shift in thinking, moving from imperative actions to declarative state management. Definitely something to mull over. And of course, check out the book, Argo CD: Up and Running, for even more detail, and engage with that vibrant GitOps community.
Absolutely, there's always more to explore. Thanks for joining us on the deep dive.
