zheludov.com:/$ blog argocd-failover
In the realm of modern DevOps practices, GitOps has emerged as a powerful paradigm for managing infrastructure and application deployments. GitOps leverages Git as the single source of truth for declarative infrastructure and applications, enabling teams to use Git workflows for deployment and operations. This approach ensures consistency, traceability, and improved collaboration among teams.
ArgoCD is a popular continuous delivery tool specifically designed for Kubernetes, which implements GitOps principles. It provides a declarative way to manage Kubernetes resources, automatically synchronizing the desired state defined in Git with the actual state in the cluster.
Running ArgoCD in multiple regions can significantly improve the resilience and availability of your continuous delivery pipeline: if one region becomes unavailable, the instance in the other region can keep deploying.
ArgoCD does not natively support active/active configurations due to potential conflicts in the application controller. If two ArgoCD instances are running simultaneously and trying to synchronize changes, they can conflict with each other, causing deployment issues.
To overcome this challenge, we can configure both ArgoCD instances with identical settings but define staggered sync windows. This way, each instance performs synchronization in alternating intervals, avoiding conflicts while ensuring continuous deployment capabilities.
First, deploy two ArgoCD instances in different regions. Ensure both instances are configured with the same settings, repositories, and target clusters. For a detailed tutorial on deploying ArgoCD, you can refer to this guide.
Define sync windows for each ArgoCD instance. One instance opens a 5-minute allow window at minutes 0, 10, 20, 30, 40, and 50 of every hour, while the other opens its windows at minutes 5, 15, 25, 35, 45, and 55. Because an allow window blocks syncs outside it, the two instances take turns and their windows never overlap, which prevents conflicts.
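# Instance in the first region: windows open at minutes 0, 10, 20, 30, 40, 50
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: example-project
spec:
  syncWindows:
    - kind: allow
      schedule: "0,10,20,30,40,50 * * * *"
      duration: "5m"
      # Assumption: apply the window to every application in the project.
      # A window with no applications, namespaces, or clusters matches nothing.
      applications:
        - "*"

# Instance in the second region: windows open at minutes 5, 15, 25, 35, 45, 55
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: example-project
spec:
  syncWindows:
    - kind: allow
      schedule: "5,15,25,35,45,55 * * * *"
      duration: "5m"
      # Assumption: same target selection as the first instance.
      applications:
        - "*"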
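Before applying the manifests, it can help to confirm that the two schedules really are disjoint. The following is a minimal sketch, not part of ArgoCD: the helper `window_minutes` is a hypothetical name, and it assumes each window covers whole minutes starting at the listed minute and lasting for the given duration.

```python
def window_minutes(minute_field: str, duration_min: int) -> set[int]:
    """Minutes of the hour covered by windows that open at each listed minute."""
    starts = [int(m) for m in minute_field.split(",")]
    return {(s + offset) % 60 for s in starts for offset in range(duration_min)}


# Minute fields and durations copied from the two AppProject manifests.
instance_a = window_minutes("0,10,20,30,40,50", 5)
instance_b = window_minutes("5,15,25,35,45,55", 5)

# Minutes claimed by both instances, and minutes claimed by neither.
overlap = instance_a & instance_b
uncovered = set(range(60)) - (instance_a | instance_b)

print("overlap:", sorted(overlap))
print("uncovered:", sorted(uncovered))
```

Both lists come out empty for the schedules above, so at minute granularity exactly one instance is allowed to sync at any time. The check says nothing about a sync that starts near the end of a window and is still running when the next one opens.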
After applying the sync window configurations, verify that each ArgoCD instance respects its designated sync window. You can check the synchronization logs to ensure they are performing updates at the correct times without overlapping.
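One way to automate this check is to compare observed sync start times against each instance's schedule. The sketch below is illustrative only: the instance names `region-a` and `region-b` and the timestamps are hypothetical placeholders, and it assumes you have already extracted sync start times (in UTC) from each instance's logs or sync history.

```python
from datetime import datetime


def in_window(ts: datetime, start_minutes: list[int], duration_min: int) -> bool:
    """True if ts falls in a window that opens at one of start_minutes each hour."""
    return any((ts.minute - s) % 60 < duration_min for s in start_minutes)


# Window start minutes per instance (instance names are hypothetical).
WINDOWS = {
    "region-a": [0, 10, 20, 30, 40, 50],
    "region-b": [5, 15, 25, 35, 45, 55],
}

# Hypothetical sample data; replace with times taken from your own logs.
observed = [
    ("region-a", datetime(2024, 1, 1, 12, 2, 30)),
    ("region-b", datetime(2024, 1, 1, 12, 7, 10)),
    ("region-a", datetime(2024, 1, 1, 12, 16, 0)),  # outside region-a's windows
]

# Collect every sync that started outside its instance's 5-minute windows.
violations = [
    (name, ts) for name, ts in observed if not in_window(ts, WINDOWS[name], 5)
]
for name, ts in violations:
    print(f"{name} synced outside its window at {ts:%H:%M:%S}")
```

An empty `violations` list over a representative period suggests the instances are respecting their windows.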
Simulate a failover scenario by taking one ArgoCD instance offline. Observe how the other instance continues to perform synchronizations without any issues. Once the offline instance is back online, ensure that it resumes its synchronization activities as expected.
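When planning the failover test, it is useful to know how long a change may wait while only one instance is available. The sketch below estimates that delay at minute granularity; `wait_for_window` is a hypothetical helper, and it assumes the surviving instance uses the second schedule from the manifests.

```python
def wait_for_window(minute: int, start_minutes: list[int], duration_min: int) -> int:
    """Whole minutes from `minute` of the hour until a window is open (0 if open now)."""
    for wait in range(60):
        now = (minute + wait) % 60
        if any((now - s) % 60 < duration_min for s in start_minutes):
            return wait
    raise ValueError("no window opens within the hour")


# Schedule of the instance that stays online during the test.
surviving = [5, 15, 25, 35, 45, 55]

# Longest wait over every possible minute at which a change could arrive.
worst = max(wait_for_window(m, surviving, 5) for m in range(60))
print(f"worst-case wait: {worst} minutes")
```

With 5-minute windows every 10 minutes, a change that arrives just after the surviving instance's window closes waits about 5 minutes for the next one, so expect syncs to be delayed rather than stopped during the failover test.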
By configuring staggered sync windows, we can run two ArgoCD instances side by side across regions in what is effectively an active/active configuration: both are always online, but only one is allowed to sync at any given time. This setup provides high availability and failover without conflicts in the application controller. With this approach, you can use ArgoCD's continuous deployment features while keeping the delivery pipeline itself resilient.
Implementing this strategy allows your organization to achieve seamless failover and maintain continuous delivery even in the face of regional outages. If you have any questions or need further assistance, feel free to leave a comment below.