Making things easier and foolproof
Before we start: a purely textual explanation of GitOps can be a little overwhelming (at least it was for me when I first explored the topic), so it may help to watch an introductory video on GitOps to build some base knowledge before we move forward.
Traditional approaches to configuration management in Kubernetes often involve manual processes, scripts, and direct manipulation of cluster resources. These methods can be error-prone, difficult to audit, and may lead to inconsistencies between different environments. As Kubernetes deployments scale, the need for a more automated, reliable, and scalable approach becomes critical.
GitOps builds upon DevOps best practices and offers a modern solution to these challenges. By leveraging Git as the single source of truth for declarative infrastructure as code (IaC), GitOps provides a robust and auditable way to manage Kubernetes configurations.
Complex YAML
Although YAML makes basic setups simple, handling complex and large-scale deployments can often become difficult. Because of YAML’s sensitivity to whitespace, incorrect indentation is a typical mistake that can result in serialisation issues that are not always obvious. Because YAML is so verbose, it can be challenging to identify the cause of problems by looking at the files alone, particularly when a single manifest contains definitions for several Kubernetes objects. The semantic linkages between various Kubernetes resources, including Deployments, Services, and Ingresses, can be complex and challenging to manage at scale, adding to the syntactic complexity. YAML’s apparent simplicity of use may conceal the underlying complexity that appears in production settings, which poses a serious obstacle to efficient configuration management.
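To see how easily this goes wrong, consider a contrived fragment (names are illustrative) where a two-space indentation difference changes the structure entirely:

# Correct: "ports" belongs to the container entry.
containers:
- name: web
  image: nginx:stable
  ports:
  - containerPort: 80

# Incorrect: "ports" is now a sibling of "containers" at the pod level,
# and the manifest fails validation with a non-obvious error.
containers:
- name: web
  image: nginx:stable
ports:
- containerPort: 80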
Resource Management and Overprovisioning
To avoid last-minute bottlenecks and ensure application availability, there is often a tendency to overprovision cloud resources. While this approach can handle intermittent traffic spikes, it also inflates management overhead and production costs due to administering resources that are not in constant use. The reactive nature of autoscaling, relying on the cloud provider’s designated infrastructure, can lead to inefficiencies if not coupled with proactive resource monitoring and rightsizing strategies. Optimising resource usage in large Kubernetes clusters with diverse application requirements becomes increasingly complex, demanding careful configuration and continuous monitoring.
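Rightsizing usually starts with explicit resource requests and limits, so the scheduler and autoscalers have accurate signals to work with. A minimal sketch (the figures are placeholders, not recommendations):

containers:
- name: sample
  image: nginx:stable-alpine-perl
  resources:
    requests:        # what the scheduler reserves for the container
      cpu: "100m"
      memory: "128Mi"
    limits:          # the hard ceiling the container may consume
      cpu: "500m"
      memory: "256Mi"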
Security Misconfiguration
The distributed and often short-lived nature of cloud-native applications in Kubernetes introduces numerous security configuration complexities, and common misconfigurations can leave clusters exposed to potential threats. Setting allowPrivilegeEscalation to true in a container’s security context is a frequent mistake: it enables container processes to gain higher privileges than their parent process, violating the principle of least privilege and potentially granting broad access to resources. Similarly, neglecting to set resource limits for third-party integrations, such as monitoring and security operators, can let resource-intensive integrations consume substantial cluster resources, causing out-of-memory errors and degrading application performance. Furthermore, the absence of network policies leaves pods with no networking restrictions, unnecessarily increasing the attack surface and allowing a compromised container to direct malicious traffic to sensitive pods without any filtering. The intricate nature of Kubernetes security requires a proactive and systematic approach to configuration management to mitigate these risks effectively.
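As a sketch of the countermeasures (values are illustrative), a hardened container spec and a default-deny network policy might look like this:

containers:
- name: sample
  image: nginx:stable-alpine-perl
  securityContext:
    allowPrivilegeEscalation: false   # enforce least privilege
    runAsNonRoot: true
  resources:
    limits:                           # cap what the workload can consume
      cpu: "250m"
      memory: "256Mi"

# A namespace-wide default-deny policy for incoming traffic:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}    # applies to every pod in the namespace
  policyTypes:
  - Ingress          # no ingress rules are listed, so all ingress is denied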
Networking and Visibility Challenges
Network visibility and interoperability present major issues when managing networking in large-scale, multi-cloud Kubernetes systems. Because pods and services are continuously created and deleted, traditional network operations techniques that rely on static IP addresses and ports are unsuitable. This transience also limits standard monitoring methods, which cannot keep up with the speed at which containers are updated and replaced, leading to visibility gaps or even complete blind spots. Constantly changing pod IP addresses and the need for seamless communication between containers across many hosts add further operational complexity, demanding careful design of load balancing, service discovery, and network policies.
Cluster Lifecycle Management and Upgrades
Managing Kubernetes cluster lifecycles, which include provisioning, scaling, upgrading, and decommissioning, takes considerable work. Operations teams can struggle to keep up with upstream Kubernetes’ rapid release cadence, which brings several updates and patches every year. Ensuring that patches and updates are applied accurately, on schedule, and without interfering with running workloads adds to the complexity. Applying security patches and upgrades at scale across several clusters without affecting performance or disrupting service is a major operational challenge.
Configuration Drift
Configuration drift, where the actual state of the Kubernetes environment deviates from the desired state defined in configuration files, is a common challenge in manually managed clusters. This can occur due to manual interventions, script executions, or inconsistencies in deployment processes. When the live environment no longer matches the intended configuration, it can lead to unpredictable behavior, deployment failures, and difficulties in troubleshooting issues. The lack of a centralized and automated mechanism to ensure configuration consistency across the cluster can result in environment sprawl and increased operational risks. Addressing configuration drift requires a system that continuously monitors the environment and automatically reconciles it with the desired state, ensuring that the Kubernetes cluster remains in the intended configuration.
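Even without a GitOps operator, you can spot drift manually by diffing the live state against the manifests in version control; for example, with kubectl (using the manifest file from the walkthrough later in this article):

# Compare the live objects against the manifest; kubectl exits with
# status 1 and prints a diff if the cluster has drifted.
kubectl diff -f initial-deployment.yaml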
What Is GitOps?
At its core, GitOps is the practice of managing infrastructure and application configurations using Git repositories as the authoritative source. It’s an evolution of Infrastructure as Code (IaC) that utilises Git’s version control capabilities to track and manage all changes to the desired state of the system. GitOps employs Git not just for storing configurations but also as the control plane for applying these configurations to the live environment. This methodology ensures that the system’s cloud infrastructure is immediately reproducible based on the state of the Git repository. Changes to the system are made through Git pull requests, which, once approved and merged, automatically trigger the reconfiguration and synchronisation of the live infrastructure to match the repository’s state.
Declarative Infrastructure (Desired State)
Imagine telling someone what you want a room to look like — the colour of the walls, the furniture arrangement — rather than giving them step-by-step instructions on how to paint, assemble, and place each item. That’s the essence of declarative infrastructure.
In GitOps, you define the desired state of your system, including your Kubernetes resources like Deployments, Services, and Namespaces, in a declarative manner. Typically, this involves using YAML manifests. The system then automatically works to configure itself to match this defined outcome. This contrasts sharply with imperative systems, where you issue specific commands to achieve a state.
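The contrast is easy to see with kubectl. The imperative commands below spell out how to reach a state step by step, while the declarative one simply submits the desired state and lets Kubernetes converge on it:

# Imperative: a sequence of explicit instructions
kubectl create deployment sample --image=nginx:stable-alpine-perl
kubectl scale deployment sample --replicas=2

# Declarative: hand over the desired state defined in a manifest
kubectl apply -f initial-deployment.yaml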
Version Control
Every change to your applications or infrastructure is implemented through commits to the Git repository. This guarantees thorough auditability, precise tracking, and the vital capability of quickly reverting to an earlier stable state in an emergency. The familiar pull request process also enables peer review and discussion before any change is merged into the main branch, streamlining collaboration on infrastructure modifications.
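In practice, an infrastructure change is just an ordinary Git workflow. A sketch (branch, path, and commit details are illustrative):

git checkout -b update-nginx-image
# ...edit the manifest...
git add manifest/initial-deployment.yaml
git commit -m "Bump nginx image tag"
git push origin update-nginx-image    # then open a pull request for review

# In an emergency, revert the merge commit to restore the last stable state:
git revert -m 1 <merge-commit-sha>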
Automation
In GitOps, automated agents (operators such as Argo CD or Flux) continuously watch the Git repository. When a change is detected (a new commit, a modified configuration), these agents automatically pull the updated configurations and apply them to your target environment, such as your Kubernetes cluster. This automated process eliminates manual interventions in deployment workflows, drastically reducing the risk of human error and ensuring consistency across all your environments.
Continuous Reconciliation
The same agents continuously compare the live state of the cluster with the desired state in Git. If any discrepancy or configuration drift is detected, whether due to manual changes, accidental errors, or failures, they automatically rectify the situation and re-establish conformity. This continuous feedback loop provides a powerful self-healing capability, allowing your system to revert to a known good state in the face of adversity.
This principle is paramount for maintaining the stability and reliability of your Kubernetes environment, proactively preventing and correcting any deviations from your intended configuration.
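In Argo CD, which we use below, this behaviour is opt-in through the Application’s sync policy. The fragment below enables both pruning and self-healing:

syncPolicy:
  automated:
    prune: true      # delete cluster resources that were removed from Git
    selfHeal: true   # revert manual cluster changes to match Git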
This example demonstrates a basic GitOps workflow using Argo CD to deploy applications to Kubernetes. Initially, Argo CD deploys the application based on configurations from a Git repository, showcasing the synchronisation process. When a change, such as a new image tag, is then pushed to the repository, Argo CD synchronises the cluster to reflect the update, automating the deployment of the new image. This process ensures the cluster’s state always aligns with the Git repository, offering increased automation, improved consistency, and enhanced auditability, while the deployment history aids in tracking changes over time. While this example is simplified, it lays the foundation for understanding GitOps principles in managing Kubernetes deployments.
Step 1 — Creating our Repository
We create the following manifest and push it to our GitHub repository under the manifest folder.
# initial-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: sample
        image: nginx:stable-alpine-perl
        ports:
        - containerPort: 80
Now, I have kept things simple here: a single Deployment creating two replicas of an nginx container. Live environments are usually more complex, but the steps shall remain the same.
Step 2 — Install Argo CD on your Kubernetes Cluster
The following commands come from the official Argo CD getting-started guide at https://argo-cd.readthedocs.io/en/stable/getting_started/
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
Since Argo CD’s API server isn’t exposed externally by default, you have a few options to access the UI:
1. Port Forwarding (for local access):
This is the quickest method for temporary access, like when you’re experimenting locally.
kubectl port-forward svc/argocd-server -n argocd 8080:443
This command forwards port 8080 on your local machine to port 443 of the argocd-server service within the argocd namespace. You can then access the UI by opening your web browser and navigating to https://localhost:8080. You might see a warning about an insecure connection because of the self-signed certificate; you can usually bypass this for local testing.
2. Service Type LoadBalancer (for cloud environments):
If you’re on a cloud provider with a LoadBalancer service, you can expose the Argo CD server using this:
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
Your cloud provider will provision an external IP address for the Argo CD server. You can find this IP using:
kubectl get svc argocd-server -n argocd -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'
Access the UI using the obtained IP address in your browser (e.g., https://<EXTERNAL-IP>). Note that this might also present a self-signed certificate.
3. Ingress (for more robust access):
For a more production-ready setup, you’d typically use an Ingress controller to expose Argo CD with proper routing and potentially TLS. This involves configuring an Ingress resource for the argocd-server service. The specifics depend on your Ingress controller (e.g., Nginx, Traefik). Refer to your Ingress controller’s documentation for configuration details.
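As a rough sketch for the NGINX Ingress Controller, the Argo CD documentation suggests SSL passthrough so Argo CD can terminate TLS itself (the host name is a placeholder, and the controller must be started with --enable-ssl-passthrough):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server-ingress
  namespace: argocd
  annotations:
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: argocd.example.com    # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: argocd-server
            port:
              name: https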
Once you’ve accessed the UI via one of the methods above, you’ll need to log in.
- The default username is admin.
- To get the initial password, run the following command:
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d
This will output the initial admin password. It’s highly recommended to change this password immediately after your first login via the “User Info” section in the UI or using the Argo CD CLI: argocd account update-password.
Step 3 — Adding a Repository
Argo CD needs to know where your application manifests (e.g., Kubernetes YAML files, Helm charts, Kustomize configurations) are stored. You add these as repositories.
- In the Argo CD UI, navigate to Settings in the left sidebar and then click on Repositories.
- Click the Connect Repo button. You’ll have options for HTTPS, SSH, GitHub App, and Google Cloud Source.
- Fill in the required details based on your repository type:
- Repository URL: The Git repository URL (e.g., https://github.com/your-org/your-repo.git).
- Connection Type: Choose HTTPS or SSH.
- Credentials (if private):
  - HTTPS: Enter your username and password/access token.
  - SSH: Provide the SSH private key. Ensure there are no extra line breaks when pasting.
You can also choose to save the credentials as a Credential Template if you have multiple repositories with similar prefixes and the same authentication. Click Connect to test the connection and add the repository.
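If you would rather keep this step in Git too, Argo CD also accepts repository credentials declaratively as a labelled Kubernetes Secret. A sketch for an HTTPS repository with an access token (names and values are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: your-repo-creds
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository   # marks this Secret as a repo config
stringData:
  type: git
  url: https://github.com/your-org/your-repo.git
  username: your-username
  password: your-access-token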
Step 4 — Adding the Application
Once you have a repository connected, you can define Argo CD applications. An application tells Argo CD what to deploy from which repository, to which Kubernetes cluster and namespace, and with which configurations.
Using the UI:
- In the Argo CD UI, click on Applications in the left sidebar.
- Click the New App button in the top-right corner.
- Fill in the application details:
- Application Name: A unique name for your application.
- Project: Choose the Argo CD project for this application (usually default to start). Projects help with RBAC and organization.
- Sync Policy: Choose between Manual (you trigger synchronization yourself) and Automatic (Argo CD automatically tries to sync when it detects changes in the Git repository). With Automatic, you can also enable Prune Resources (remove resources in the cluster that are no longer in Git) and Self Heal (revert manual changes in the cluster to match Git).
- Source:
  - Repository URL: Select the repository you added in the previous step.
  - Revision: Specify the Git branch, tag, or commit to use.
  - Path: The path within the repository where your Kubernetes manifests, Helm chart, or Kustomize configuration is located.
  - (Optional for Helm): Chart, Values Files, Values (inline YAML).
  - (Optional for Kustomize): Kustomize Path.
- Destination:
  - Cluster: The target Kubernetes cluster. https://kubernetes.default.svc usually refers to the cluster where Argo CD is running; you can add and select other clusters if needed.
  - Namespace: The Kubernetes namespace where the application should be deployed.
Click Create to create the application. Argo CD will then attempt to sync the application based on your chosen sync policy.
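Alternatively, the application itself can be defined declaratively as an Argo CD Application resource, keeping even this configuration in Git. A sketch matching this walkthrough (the repository URL is a placeholder):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sample-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/your-repo.git
    targetRevision: main
    path: manifest            # folder containing initial-deployment.yaml
  destination:
    server: https://kubernetes.default.svc   # the cluster running Argo CD
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true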
Either way, Argo CD shall now automatically sync with your repository, pull the desired state from it, and deploy it to your Kubernetes cluster (which is currently empty).
Now the fun begins. Let’s make a minor change (change the nginx image tag) and push the manifest to our GitHub repository. In larger environments or corporate settings, these changes shall go through pull-request review, with the strictest protocols followed.
# initial-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: sample
        image: nginx:stable-alpine3.21-perl   # changed tag
        ports:
        - containerPort: 80
Once again, this becomes the new desired state of the application. Argo CD recognises it and automatically triggers a rollout to the new state (if automatic sync is configured).
Boom! There’s the new image. But hey, what if this new image breaks my application? What if other dependent components are no longer compatible, hampering my application? Rollback!!
Argo CD stores a record of all the revisions deployed. Simply select an earlier revision and deploy it. Wow! That is simple.
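The same rollback takes two commands with the Argo CD CLI (the application name matches the example above); note that Argo CD refuses to roll back an application while automated sync is enabled, so you may need to disable auto-sync first:

# List the deployment history with revision IDs
argocd app history sample-app

# Roll back to a previous revision by its ID
argocd app rollback sample-app 1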
This simplified example needs significant hardening for corporate GitOps deployments, with an emphasis on production-grade robustness. Improvements include advanced deployment techniques such as blue/green and canary releases, thorough automated testing and validation, and strong security hardening with RBAC and secure secrets management. Corporate setups additionally require automated promotion processes, sophisticated environment management, and integration with monitoring and observability tooling for deep insights. Along with stringent audit trails and compliance procedures, automated rollback techniques and disaster recovery plans become essential for guaranteeing scalability and performance when managing heavy traffic and extensive deployments.
Source Credit: https://medium.com/google-cloud/back-to-basics-understanding-gitops-for-configuration-management-on-kubernetes-8478f2a8d6e4
