Cross-cluster Argo Rollouts with ArgoCD
Over a sufficiently long time span, the number of Kubernetes clusters approaches (and exceeds!) n, where n is the number of things an engineer can reasonably grok. ArgoCD helps reduce cognitive load through a hub-and-spokes model.
Sooner or later a production incident breaks the proverbial camel's back and someone - probably SRE - raises progressive deployments using Argo Rollouts for canary or blue/green deployment. Now, Argo Rollouts is a Kubernetes operator that is deployed to each cluster. It comes with two end-user (that is, developer) interfaces out of the box: a CLI plugin for kubectl and a web UI. The web UI is easy to use but lacks some important features - chiefly, it does not support authentication or RBAC. The CLI is powerful but requires cluster access and only allows a developer to interface with one cluster at a time.
Surely we can do better with some Argo* vertical integration!
Designing a solution
We now need to balance two goals:
- Minimize the number of places developers manage their applications.
- Allow developers to see details of progressing Rollouts.
We can leverage ArgoCD as the "single pane of glass" for viewing and managing Rollouts, using the Rollouts Extension and ArgoCD Application Actions respectively. This lets us reuse existing ArgoCD tooling (SSO for web and CLI, RBAC) and reduces the need for developers to access the cluster where the application is deployed.
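To make the hub-and-spokes idea concrete, here is a rough sketch of an ArgoCD Application that lives on the hub but deploys to a spoke cluster. The application name, repository URL, path, and cluster address are all hypothetical placeholders; only the overall shape of the manifest is the point.

```yaml
# Sketch only: every name, URL, and path below is a hypothetical placeholder.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-prod-east            # hypothetical application name
  namespace: argocd                 # assumes ArgoCD runs in the "argocd" namespace
spec:
  project: my-project               # the project later referenced by RBAC policies
  source:
    repoURL: https://git.example.com/my-team/my-app.git   # hypothetical repo
    targetRevision: main
    path: deploy/prod-east          # hypothetical path holding the Rollout manifest
  destination:
    server: https://prod-east.example.com:6443   # API server of the spoke cluster
    namespace: my-app
```

Developers interact with this Application through ArgoCD; they never need credentials for the spoke cluster itself.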
Installing the extension
ArgoCD recently introduced argocd-extension-installer to download and install extensions at server start time. Using the Helm chart, installing the extension takes only a few values:
server:
  # ...
  extensions:
    enabled: true
    extensionList:
      - name: rollout-extension
        env:
          - name: EXTENSION_URL
            value: https://github.com/argoproj-labs/rollout-extension/releases/download/v0.3.4/extension.tar
You can verify at server startup that the extension is installed correctly, for example by checking the logs of the installer's init container on the argocd-server pod.
RBAC
Assuming you're not living fast and loose with * permissions, you can explicitly grant your developers permissions to perform actions on the Rollout object.
p, role:team-my-team, applications, action/argoproj.io/Rollout/*, my-project/*, allow
g, [my-team-okta-group], role:team-my-team
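As a sketch of where such a policy might live, the fragment below places it in the argocd-rbac-cm ConfigMap and narrows the wildcard to individual actions. It assumes ArgoCD is installed in the argocd namespace, that the team only needs Resume and Abort, and that my-team-okta-group stands in for your real group name. Note that the object field for applications takes the form project/application.

```yaml
# Sketch only: namespace, group name, and the chosen actions are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    # Let the team see its applications in the first place.
    p, role:team-my-team, applications, get, my-project/*, allow
    # Grant individual Rollout actions instead of the full wildcard.
    p, role:team-my-team, applications, action/argoproj.io/Rollout/resume, my-project/*, allow
    p, role:team-my-team, applications, action/argoproj.io/Rollout/abort, my-project/*, allow
    # Map the SSO group to the role (placeholder group name).
    g, my-team-okta-group, role:team-my-team
```

Listing actions one by one is more verbose than the wildcard, but it lets you withhold the more drastic ones (such as promote-full) from teams that should not skip canary steps.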
Putting it together
Let's use the demo app and start with a simple rollout strategy.
strategy:
  canary:
    steps:
      - setWeight: 20
      - pause: {}
      - setWeight: 40
      - pause:
          duration: 10
      - setWeight: 60
      - pause:
          duration: 10
      - setWeight: 80
      - pause:
          duration: 10
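For context, here is a minimal sketch of how that strategy might sit inside a complete Rollout manifest. The resource name, labels, and image tag are assumptions based on the public rollouts-demo image, and the replica count of 5 matches the walkthrough that follows. Only the first few canary steps are repeated.

```yaml
# Sketch only: name, labels, and image tag are assumptions, not the demo's exact manifest.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5                       # 20% canary weight => 1 of 5 pods
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
        - name: rollouts-demo
          image: argoproj/rollouts-demo:blue   # changing this tag triggers a new rollout
          ports:
            - containerPort: 8080
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: {}                 # no duration: waits until someone resumes
        - setWeight: 40
        - pause:
            duration: 10            # a bare integer is interpreted as seconds
        # remaining steps as in the strategy above
```

Apart from the strategy block, the spec mirrors a regular Deployment, which is what makes migrating existing workloads fairly mechanical.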
Pushing a change (specifically: a tag update) causes Argo Rollouts to set the canary weight to 20% of 5 pods (i.e. 1 pod) and then stop at the first pause step, which has no duration and therefore waits for a manual resume. ArgoCD marks the Application as Suspended since it understands Rollout status via resource health checks.
Clicking into the Application allows us to view the Rollout object. We can see the pod distribution clearly.
But if we click on the Rollout object we can see a new Rollouts tab that shows a read-only view of the Rollouts dashboard!
Now we need to manually resume the Rollout. Going back to the main Application view, we can click on the hamburger menu on the Rollout object and see the available actions. We see the expected Rollout-specific actions, including Abort, Promote-Full, and Resume.
Under the hood, an "action" is just ArgoCD modifying the Rollout object on the target cluster.
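To illustrate what "modifying the Rollout object" can look like, the excerpt below sketches the status of a Rollout that is parked at an indefinite pause step. This is an assumption about the shape of the paused state rather than output captured from the demo, and the timestamp is a placeholder.

```yaml
# Sketch only: illustrative excerpt of a paused Rollout, not captured output.
status:
  controllerPause: true
  pauseConditions:
    - reason: CanaryPauseStep            # paused by a canary `pause` step
      startTime: "2024-01-01T00:00:00Z"  # placeholder timestamp
```

Resuming amounts to clearing that pause state, after which the Argo Rollouts controller on the target cluster carries on with the next step. ArgoCD only patches the object; the controller still does the actual work.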
On clicking Resume we can watch the Rollout proceed.
Easy-peasy! We can use ArgoCD as our hub for progressive rollouts across multiple Kubernetes clusters! 🎉🎉🎉