Building a multi-cluster GitOps setup with FluxCD

For my Kubernetes-based Smart Factory projects at MaibornWolff, GitOps is the way to go. It makes developing and operating complex distributed platforms manageable and easy. In an earlier article titled Why I use GitOps I explained in detail what GitOps is and what advantages I see in it. In this article I want to expand on that and explain how we are doing GitOps with FluxCD.

What is GitOps

But first let's recap some basics: GitOps is a philosophy and concept for managing infrastructure and software deployments. It uses Git as a central and single source of truth that contains a complete and declarative description of a system (infrastructure and applications). Changes to the system can only be made through Git.

GitOps systems use a continuous reconcile loop, meaning they constantly compare the intended state from Git to the real state of the system and bring the system forward (or back, if you made manual changes bypassing the process) to the intended state.

FluxCD

To build a GitOps-based platform we need a GitOps tool. For my projects I rely on FluxCD. It is a lightweight tool (it calls itself the GitOps toolkit, gotk for short) composed of a set of Kubernetes controllers that handle different aspects of a Continuous Delivery workflow.

For our purpose the relevant controllers are:

  • source-controller: Is in charge of pulling and providing data from repositories that contain the declarative state to be applied. Sources can mainly be Git repositories, Helm repositories and OCI repositories.
  • kustomize-controller: It takes Kubernetes manifests from sources, pairs them with Kustomize for patching, adds simple variable substitution and then applies the resulting manifests to Kubernetes in a continuous loop.
  • helm-controller: Takes Helm Charts from sources and deploys them to Kubernetes using the Helm tooling.

Each controller provides a number of Kubernetes Custom Resource Definitions (CRDs) to configure and control Flux. The important ones are GitRepository (source for manifests from a Git repository), HelmRepository (source for Helm Charts), Kustomization (manifests to be applied from a source using Kustomize) and HelmRelease (Helm Chart from a repository to be applied).

With just these three controllers and four CRDs we can build an easy-to-use but flexible and powerful GitOps deployment system.

The other prominent GitOps tool is ArgoCD. In many aspects it is similar to FluxCD but I find it to be more complex to use. It thinks more in terms of independent Applications that are deployed and managed and less in terms of a coherent platform. In my opinion it moves away a bit from Git being the sole and central point of truth and interaction for a system. It is a great tool, but I personally find FluxCD to be easier to use and the better building blocks for my platforms.

Situation

Before I start describing our approach with GitOps, I need to give some context on the architecture of our Smart Factory platforms. We generally have one cluster per factory/plant/location that is running on-premise at the location. This cluster hosts a machine connectivity solution like Cybus Connectware or HiveMQ Edge to get data from machines running in the plants. It also runs any workloads that are production-critical (in the sense that, if they are disconnected, machines stop and the company loses money), as well as workloads that require low latency or are too bandwidth/storage-hungry to deal with a (limited or expensive) cloud connection. If this plant cluster cannot directly communicate with machines, we deploy additional Edge devices close to the machine. Close means both physically close and close from a network perspective; often they sit in the control cabinet of the machine they connect to. We try to also run Kubernetes on these to have the same runtime and API everywhere. Collected (and potentially already locally processed) data is sent to a central cloud cluster that hosts the majority of workloads. There might also be additional cloud clusters for special purposes or use cases.

For development and testing purposes the entire cluster landscape exists in several stages. Besides a production stage/environment, we generally also have a Dev/Test environment (although not always in all locations and normally with a smaller scale and fewer resources). And, depending on the needs of the customer, also a QA/staging environment that is used for testing/integrating with production-like data and resources.

In all these clusters and stages we, in most cases, also support multiple teams / tenants / use cases per cluster.

Depending on the Kubernetes distribution used we often rely on a management cluster and workload cluster approach where there exists one central management cluster that uses Cluster API (CAPI) to manage all the actual clusters running workloads. Giantswarm is a company we like to work with that uses CAPI as the basis of their implementation and provides a fully-managed Kubernetes platform. There are of course also other products like Nutanix NKP or Rancher.

Structuring a Git repository

With the Kubernetes situation out of the way, let's look at how to actually structure a GitOps repository with FluxCD.

What we want is a simple and modular structure that supports multiple clusters and does not require clusters to be completely alike. What we go with are the top-level folders apps (contains a folder per component/application with all its manifests) and clusters (contains a folder per cluster and pulls in apps to be deployed in that cluster). It looks like this:

├── apps
│   ├── cert-manager
│   ├── ingress-nginx
│   └── prometheus-stack
└── clusters
    ├── cloud-dev
    ├── cloud-prod
    ├── munich-dev
    └── munich-prod

The application manifests can be as simple as a HelmRelease to install a Helm chart from an external repository if these are provided by the application developers. Or as complex as custom written Kubernetes manifests utilizing Kustomize. All the complexity needed to install a specific application is encapsulated in that manifest folder.
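
As an illustration of the simple case, here is a minimal sketch of what an app folder for ingress-nginx could contain. The file name, the namespaces, the version range and the chosen chart value are my assumptions and not taken from a real project; the API versions are the ones from recent Flux releases (older releases use beta versions).

```yaml
# apps/ingress-nginx/release.yaml (hypothetical file name)
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: platform
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: platform
spec:
  interval: 10m
  targetNamespace: ingress-nginx   # assumed target namespace
  install:
    createNamespace: true
  chart:
    spec:
      chart: ingress-nginx
      version: "4.x"               # assumed version range
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
  values:
    commonLabels:
      # illustrative only: filled in per cluster via variable substitution
      cluster: ${clusterName}
```

The `${clusterName}` placeholder only gets a value once a cluster pulls in the folder and supplies the variable.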

To deploy an application in a specific cluster, we place a Kustomization manifest in the cluster folder that points to the app folder. A simple one could look like this:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ingress-nginx
  namespace: platform
spec:
  interval: 10m
  timeout: 1m
  sourceRef:
    kind: GitRepository
    name: platform
  path: "./apps/ingress-nginx" # The app to install
  prune: true
  postBuild:
    substitute: # cluster-specific configuration for the app
      clusterName: "cloud-dev"

We use postBuild variable substitutions to parameterize apps per cluster. If the changes are too complex for simple variables, we can also utilize Kustomize patches.
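
A sketch of the patch variant could look like the following. It assumes the app folder contains a HelmRelease named ingress-nginx; the patched value (a higher replica count for a production cluster) is purely illustrative.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ingress-nginx
  namespace: platform
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: platform
  path: "./apps/ingress-nginx"
  prune: true
  postBuild:
    substitute:
      clusterName: "cloud-prod"
  patches:
    # Merge patch applied on top of the shared app manifests
    - target:
        kind: HelmRelease
        name: ingress-nginx
      patch: |
        apiVersion: helm.toolkit.fluxcd.io/v2
        kind: HelmRelease
        metadata:
          name: ingress-nginx
        spec:
          values:
            controller:
              replicaCount: 3   # illustrative cluster-specific override
```

Patches are applied while the manifests are built and the variable substitution runs afterwards, so both mechanisms can be combined in one Kustomization.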

Sometimes the cluster-specific configuration takes the form of a Secret or ConfigMap. These are then also placed in the cluster folder.

The Kustomizations act basically as pointers, pulling in the apps we want for a specific cluster.

FluxCD needs a source to get the manifests from, in our case a Git repository. This is declared using a GitRepository resource:

apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform
  namespace: platform
spec:
  interval: 2m
  url: ssh://git@github.com/some-company/gitops-platform.git
  ref:
    branch: main
  secretRef:
    name: github-auth

Kustomizations can then reference this repository as I have already shown above.

A folder for one cluster could look like this:

cloud-dev
├── secrets
│   ├── cert-manager-credentials.enc.yaml
│   └── github-auth.enc.yaml
├── cert-manager.yaml
├── ingress-nginx.yaml
├── kustomization.yaml
├── prometheus-stack.yaml
└── sync.yaml
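
The kustomization.yaml in this folder is a plain Kustomize file (not to be confused with the Flux Kustomization resource) that lists everything belonging to the cluster. A minimal sketch, assuming the file names from the listing above:

```yaml
# clusters/cloud-dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - sync.yaml
  - secrets/cert-manager-credentials.enc.yaml
  - secrets/github-auth.enc.yaml
  - cert-manager.yaml
  - ingress-nginx.yaml
  - prometheus-stack.yaml
```

Listing the resources explicitly makes it obvious from a single file what a cluster consists of.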

Handling sensitive information

Sensitive information is part of any Kubernetes setup. Be it a TLS client certificate to connect with an MQTT broker or credentials for a database. To have a unified setup and not distribute information between several systems, we want to also manage such information via GitOps. But of course we don't want to store sensitive information in clear text in Git.

This is where SOPS comes in. It allows us to encrypt and manage files using secure external keys so we can safely store them in Git. FluxCD has an integration with SOPS and can decrypt files on-the-fly during the reconciliation loop if provided the key. The key type we default to is age, a modern and simple implementation of asymmetric public/private key-pair encryption akin to GnuPG. The keys need to be provided to FluxCD as a Secret. Age keys are easy to handle and require no extra infrastructure, so they are ideal for getting started.
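
To make this more concrete, here is a sketch of the two pieces involved: the Secret holding the age private key and the decryption setting on a Flux Kustomization. The Secret name sops-age and the namespace are assumptions; the key value is a placeholder, never a real key.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: sops-age          # assumed name
  namespace: platform
stringData:
  # Flux looks for entries whose name ends in ".agekey"
  # The value below is a placeholder for the private key
  age.agekey: |
    AGE-SECRET-KEY-PLACEHOLDER
---
# Excerpt from a Flux Kustomization that should decrypt SOPS files
spec:
  decryption:
    provider: sops
    secretRef:
      name: sops-age
```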

By convention I always name any SOPS-encrypted secrets with a suffix of .enc.yaml to make clear what they contain. If I need to decrypt a secret on disk, I always use a suffix of .clear.yaml and add that suffix to the .gitignore of the repo to avoid ever accidentally committing the clear text data.
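
SOPS can be told which files to encrypt with which key through a .sops.yaml file in the repository. The following is a sketch that fits the naming convention above; the regular expressions are my assumption and the public key is a placeholder.

```yaml
# .sops.yaml at the repository root
creation_rules:
  # Match both suffixes so the rule applies whether a file is encrypted
  # in place (.enc.yaml) or from its clear-text copy (.clear.yaml)
  - path_regex: \.(enc|clear)\.yaml$
    # Only encrypt the payload of Secrets, keep metadata readable
    encrypted_regex: ^(data|stringData)$
    age: "<age-public-key>"
```

Restricting encryption to data and stringData keeps the name and namespace of a Secret readable in Git diffs and reviews.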

If the customer has a system in place or wants the extra bit of security, we can also couple SOPS with a key management service from a cloud provider (AWS KMS, Azure Key Vault, GCP KMS). This helps to avoid having to secure and distribute the age key. As the encryption backend can be changed, I generally prefer to get started with age keys and later switch if desired.

Why do we not simply store all sensitive information directly in a service like Azure Key Vault and use a tool like the External Secrets Operator to provision the secrets? For one this introduces an additional component (the external-secrets-operator) that needs to be deployed and configured. But the main argument for me is that by storing the information in Git we have everything in one place. The secrets are also versioned (and via Git commit history also audited), right alongside all the rest of the manifests. Which just makes the system easier to reason about and removes a potential failure point and therefore complexity.

But in my opinion the best way to make the system more secure is not by storing secrets elsewhere, but by not having credentials that need to be stored at all. The way to go here is a feature I already wrote about in an earlier blog post, called Workload Identity. If pods can just use their Kubernetes-provided identity to authenticate with external systems, there are no credentials to be handled, or compromised or stolen. Thus making the whole system more secure and easier to handle.

Another option can be to create and manage credentials dynamically. For example the hybrid-cloud-postgresql-operator I wrote a few years ago. Based on custom resources applied in Kubernetes it provisions database instances using several backends. The trick is that the operator dynamically creates the passwords for the databases it manages and stores them in secrets with user-defined names. Thus there is no need to statically provision a secret using either SOPS or an external secrets manager. There is still the risk of the credentials getting stolen, so it's not a perfect solution.

Bootstrapping and self-management

With HelmRelease and Kustomization manifests we have FluxCD managing our applications and components. But we do not want to have to manually add these manifests to our clusters. The solution is to define an entrypoint using another Kustomization that points to the folder for a specific cluster. Thus we only need to apply the manifests for the GitRepository and this top-level Kustomization. FluxCD will pull in and reconcile the manifests for our cluster and then recursively deploy the actual applications. I refer to the combination of these two manifests as a sync point.
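
A root Kustomization for the cloud-dev cluster could look like this sketch. It assumes the GitRepository named platform shown earlier and a SOPS key Secret named sops-age (the Secret name is my assumption).

```yaml
# clusters/cloud-dev/sync.yaml (alongside the GitRepository manifest)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cloud-dev
  namespace: platform
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: platform
  path: "./clusters/cloud-dev"   # the folder for this cluster
  prune: true
  decryption:
    provider: sops
    secretRef:
      name: sops-age             # assumed Secret name
```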

And if we place these root sync point manifests in the folder for the cluster, the entire setup becomes self-managing. Meaning FluxCD will not only apply changes to any of the app manifests, it will also do so for the entrypoint manifests themselves. If we were to change the interval on the GitRepository, FluxCD would pull in the change, apply the manifest and then use it. This removes the need to directly interact with the Kubernetes API to make changes; we can do it all via Git.

The process of initially kick-starting FluxCD is called bootstrapping. It involves manually applying the minimal set of manifests (GitRepository, root Kustomization, git credentials and SOPS key Secrets) so that FluxCD can start doing its work.

The bootstrap process involves installing FluxCD itself first. To get around the chicken-and-egg problem there is no alternative but to manually apply the manifests that are needed to deploy FluxCD. Afterwards we can make FluxCD manage itself. Either we keep these manifests completely separate in Git, together with their own sync point (a GitRepository and a Kustomization) that points to them. Or we treat FluxCD as another app we pull into our clusters. The latter can increase the risk of accidentally taking down FluxCD (and potentially the entire cluster), so safeguards should be applied. The update mechanism itself is safe, as the job of the FluxCD controller process is done as soon as the manifests are applied. Rolling out the changes (e.g. by restarting controller pods) is handled by Kubernetes itself and FluxCD is not needed for that.

If the setup uses the management cluster approach with Cluster API, we can integrate with that to automate the bootstrapping. Kustomizations can be configured to be applied not to the local Kubernetes cluster but to a remote one by pointing it to a Secret containing a KubeConfig. Such Secrets are created by Cluster API when provisioning clusters. We can use that and place a bootstrapping Kustomization in the management cluster to tell the management cluster FluxCD to provision both the workload cluster FluxCD itself and the bootstrapping set of manifests to a newly created cluster. The rest of the work is then done by the new FluxCD instance in the workload cluster.
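
A sketch of such a bootstrapping Kustomization in the management cluster is shown below. Cluster API stores the kubeconfig of a workload cluster in a Secret named after the cluster with the suffix -kubeconfig, under the key value. The bootstrap folder path and the resource names are hypothetical.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: bootstrap-munich-dev      # hypothetical name
  namespace: platform             # must be the namespace holding the kubeconfig Secret
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: platform
  path: "./bootstrap/munich-dev"  # hypothetical folder with FluxCD and the sync point
  prune: false                    # a safeguard: never delete resources in the workload cluster
  kubeConfig:
    secretRef:
      name: munich-dev-kubeconfig # created by Cluster API for the cluster munich-dev
      key: value
```

Setting prune to false here is a design choice I would consider, as a mistake in the management cluster should not be able to remove FluxCD from a workload cluster.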

Self-management enables easy SOPS key rotation

One cool feature of having a self-managing GitOps setup with SOPS is that we can do key rotations using only Git. As FluxCD needs the SOPS key available as a Secret, it makes sense to also manage it via FluxCD just like any other manifest. It may seem a bit weird to have the Secret containing the SOPS key encrypted using that same key, but it fits nicely into the self-management aspect. It also enables easy key rotation. SOPS allows us to use multiple keys at once to encrypt with. Thus the process to rotate keys is very simple:

  1. Generate a new key
  2. Encrypt all Secrets in Git with both the old and new key
  3. In Git change out the content of the SOPS key Secret with the new key, but make sure to encrypt it with the old key
  4. Commit and push the changes
  5. FluxCD will pull the changes and apply them, thereby switching out the SOPS key to use
  6. Wait for one reconcile loop to finish
  7. Re-encrypt all Secrets in Git with only the new key and commit the changes
  8. As FluxCD already has the new key, it can read all the encrypted manifests

Using this process a key rotation is a matter of minutes and two Git commits. No direct access to Kubernetes is needed.
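
During the transition phase of the rotation, a .sops.yaml could list both public keys so that every file is readable with either private key. This is a sketch with placeholder keys; the regular expressions are assumptions.

```yaml
# .sops.yaml while both keys are valid
creation_rules:
  - path_regex: \.(enc|clear)\.yaml$
    encrypted_regex: ^(data|stringData)$
    # Comma-separated list of recipients: old key first, new key second
    age: "<old-age-public-key>,<new-age-public-key>"
```

After changing the recipients, the sops updatekeys command can re-wrap existing files for the new set of keys. For the final commit of the rotation the old public key is removed from the list again.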

Moving changes through environments

We have our structure with apps and clusters. But how do we develop and test changes and features and propagate them through the clusters/environments?

The way I prefer to do things is to have a separate sandbox cluster and have FluxCD in that cluster work from a feature branch. This corresponds with software development which also often works with feature branches and sandbox environments.
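
For the sandbox cluster the only difference in the sync point is the branch of the GitRepository. A sketch, with a hypothetical branch name:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform
  namespace: platform
spec:
  interval: 1m                           # shorter interval for faster feedback (assumption)
  url: ssh://git@github.com/some-company/gitops-platform.git
  ref:
    branch: feature/new-ingress-config   # hypothetical feature branch
  secretRef:
    name: github-auth
```

Because the setup is self-managing, the GitRepository manifest for the sandbox cluster in that branch has to name the feature branch as well. Otherwise FluxCD would apply the manifest from Git and switch itself back to main.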

The sandbox cluster should be considered ephemeral, meaning it can and should be torn down regularly and provisioned anew. Not only does this save on cloud costs if it is not used, it also ensures that the GitOps manifests are still in a state to bring up a functioning cluster from nothing. Which in established setups can often be a problem as, even with a declarative setup, sometimes assumptions sneak in that hold for an existing setup that is merely updated but not for a completely new cluster. It also ensures there were no resources added to the setup manually.

Once the feature is developed and tested, the branch can be merged to main just as with software. But we probably don't want to roll out the change to all clusters at once. Be it to give development teams time to test their applications with the changes. Or to take heed of maintenance windows for production clusters (especially in industrial environments this can be tricky as not all services are highly available or allow for downtime-free deployments).

If the changes are only in the cluster-specific folders, it's easy: we just make the change for a cluster when we want to deploy it.

For changes in the app manifests that are shared between clusters we have two main options:

  • We can use a feature flag approach and enable the change cluster by cluster. This makes roll out easy. But it needs to be tested thoroughly to ensure the change doesn't accidentally leak from the feature flag. And depending on the change it might also be hard to put it behind a feature flag, especially if it concerns the structure of the deployment.

  • The other approach is to have a copy of the affected app folder with the changes and then one by one deploy to clusters by switching the app path a cluster points to. This is also a good way to handle major application updates that require configuration changes and cannot be handled by just templating a version (of a Helm chart or container image). Once the roll out is complete, the old app folder can just be deleted.
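
The second approach comes down to a one-line change per cluster. A sketch, assuming a copied app folder with the hypothetical name prometheus-stack-v2:

```yaml
# clusters/cloud-dev/prometheus-stack.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: prometheus-stack
  namespace: platform
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: platform
  # Previously "./apps/prometheus-stack"; other clusters keep the old
  # path until it is their turn
  path: "./apps/prometheus-stack-v2"
  prune: true
```

As the name of the Kustomization stays the same, FluxCD treats this as an update of the existing deployment and not as a new one.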

You might be tempted to have a setup where each cluster is deployed from a different Git branch. But I have found this to be an inferior solution. On the one hand, propagating changes (and not forgetting anything) can be complex and lead to lots of merge conflicts. It is also easier for clusters to diverge. On the other hand it makes it harder to know the state of the entire platform with all clusters. The approach I describe in this post has all clusters (except sandbox) always on the main branch. Therefore there is just one location (the main branch) that holds the truth for all clusters.

There is one situation where using branches (aside from feature branches for the sandbox cluster) can be a good idea: when you make major breaking changes to the structure of the GitOps repository that cannot be rolled out in the manners described above. Then it makes sense to have a branch with the new structure and switch clusters to this branch one by one. But this needs to be a temporary setup for the migration, and afterwards all clusters should again point to the main branch.

Multiple teams

In the beginning I mentioned that our platforms are built for multiple teams (or tenants). How can we represent this with FluxCD? We do this by introducing another layer on top of the GitOps setup.

This tenant layer uses the same structure of apps and clusters, but instead of applications it manages sync entrypoints for different teams. That is basically a GitRepository, a Kustomization, and the Secrets for the git credentials and the SOPS key. Then each team has its own GitOps repository that (at least at the top layer) must follow the same structure of having a folder per cluster. The tenant manifests simply point to the repository of the team and the folder for a specific cluster. It might seem over-complex, but this way we have the similarity in structure making it easier to handle multiple repositories. And it gives teams freedom to handle deployments and manifests the way they want without having to depend on the platform team.
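
A tenant sync entrypoint could look like the following sketch. The team name, repository URL, Secret names and the service account are all hypothetical; the service account is one way to restrict what FluxCD may deploy on behalf of the team.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: team-a
  namespace: team-a
spec:
  interval: 2m
  url: ssh://git@github.com/some-company/team-a-gitops.git  # hypothetical repository
  ref:
    branch: main
  secretRef:
    name: team-a-git-auth
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a
  namespace: team-a
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: team-a
  path: "./clusters/cloud-dev"           # same folder-per-cluster convention
  prune: true
  serviceAccountName: team-a-reconciler  # apply with the team's permissions only
  decryption:
    provider: sops
    secretRef:
      name: team-a-sops-age              # each team has its own SOPS key
```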

To make the structure unified I also treat the platform itself as just another tenant. Each part of the GitOps system having the same structure in my opinion reduces complexity and makes handling easier.

One big aspect of having multiple teams in one cluster is separation and isolation of the teams and their workloads. This starts at the FluxCD level to make sure teams can only deploy into their domain, goes further into permission management via Kubernetes RBAC (role-based access control) and includes the network layer and potentially Service Meshes. But this is an entire topic in itself so I will not cover it here.

External resources

No platform stands in isolation. For clusters in the cloud you might want to use some managed services from your provider (e.g. a Managed Kafka or a Managed Database). So the question is, how to integrate these with the GitOps setup?

Terraform (or its open-source fork OpenTofu) is the most widely used tool to manage cloud resources. It is a tried and true approach for cloud management. Using the Kubernetes Provider for Terraform we can build a bridge between both ecosystems and for example provision Kubernetes Secrets using Terraform with credentials for the database we created using another Terraform provider.

But for me this approach has one big drawback: We have two different languages and approaches for one setup. It is not just HCL vs YAML, but also FluxCD controllers running in the clusters vs separate Continuous Delivery pipelines to run Terraform. And this diversity adds complexity I try to avoid.

An alternative is the use of Crossplane. It runs as operators in Kubernetes and exposes CRDs to manage cloud resources. As an example, the following manifest (adapted from the Crossplane AWS Quickstart) can be used to provision an AWS S3 Bucket:

apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: my-test-bucket
spec:
  forProvider:
    region: eu-central-1
  providerConfigRef:
    name: default

This way a cloud resource becomes just another Kubernetes manifest that can be managed via FluxCD. Giantswarm does a great job there and automatically provisions the necessary Crossplane configuration to deploy resources into a cluster's cloud account as part of the cluster setup.

Conclusion

Hopefully I've been able to give some insights into how I structure GitOps setups with FluxCD. Keep in mind this setup is based on my experience and specialized for the Smart Factory Platforms that I build at MaibornWolff. For other areas the setup might look different. I have also not gone into every detail possible, because the details change all the time as they depend on the customer and the situation.

This setup has a few underlying base concepts:

  • Git represents the single source of truth for the entire system. In the event of a disaster it must be possible to restore the entire system from Git (excluding data storage of course, for which backups need to exist).
  • Not only must Git be the single truth, it should also always clearly show what is deployed where in which configuration. If I cannot get a complete picture of the intended setup of my platform just by looking at the main branch in Git, I did something wrong.
  • The structure is built around small modular units that are easy to reason about and use as simple building blocks.

I have used this type of setup with a range of former and current customers at MaibornWolff, in each case successfully. The details vary from project to project and of course such a setup evolves, but the basic concepts stay the same.