Turbo-charging Ignition on Kubernetes with GitOps
I have already written several posts about GitOps and how I build Kubernetes Smart Factory Platforms with FluxCD. Today I want to describe how we integrated the Low-Code tool Ignition into our Kubernetes-based platform and how we automated its deployment despite the software not being Kubernetes-native.
This article was written in collaboration with my MaibornWolff colleague and Smart Factory expert Henning Heine. You can also read it from him on LinkedIn.
Intro
Proof-of-concepts are easy. Scaling digital use cases across an entire enterprise with multiple sites worldwide is the hard part of building Smart Factories, and doing that takes the right technology. When it comes to building use cases, a great building block is Ignition from Inductive Automation. Kubernetes is the go-to solution when you want to build a robust and scalable Smart Factory infrastructure. And nowadays a GitOps approach for infrastructure and application deployments has become standard. If a company wants to harmonize its technology stack and quickly build and roll out new digital solutions, it needs to combine Ignition, Kubernetes and GitOps.
In this article we describe how a harmonized infrastructure based on Kubernetes can be a key factor in scaling Smart Factory use cases, and how Ignition as an application development platform can be managed and run smoothly within Kubernetes.
What is Ignition
Ignition is a versatile platform for building industrial applications with a low-code approach. Originating from the SCADA domain, Ignition is equipped with an extensive library of hardware drivers for direct interaction with PLCs. In addition to that, it can be easily connected to databases, HTTP resources or a larger Industrial Internet of Things (IIoT) infrastructure via MQTT or several cloud connectors. The Perspective module lets developers build web-based applications for the shop floor. Easy access to data from various sources paired with flexible frontend development functionality and a strong development community make Ignition a good fit for developing Smart Factory applications.
What is Kubernetes
Kubernetes has established itself as the de-facto standard for container workload orchestration. Originally started at Google, it has become a huge and widely used open source project. At its core, Kubernetes is responsible for scheduling and orchestrating containers (grouped into units called Pods) over a fleet of nodes. That means it takes care of finding a suitable place for a Pod. Among other things, it looks for a node with enough free CPU and memory to accommodate the Pod. It can limit its search to nodes that don't already run Pods the new Pod should not share a node with (to stay highly available if a node crashes). Or it may look for a node with special hardware, like a GPU for AI workloads.
Besides managing Pods themselves, Kubernetes also takes care of everything needed to run them, like networking, storage, and configuration. The interface for Kubernetes, its API, is completely declarative. Workloads are defined as a set of YAML manifests that describe the resources they need. It is then the job of the Kubernetes Control Plane to provide the resources and keep them in the described state. This is called the reconcile loop: Kubernetes continuously compares the described state with the actual state and tries to make them match.
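To make the reconcile idea concrete, here is a minimal sketch in Python. It is not how the Kubernetes Control Plane is implemented; the names `Action`, `reconcile` and `apply_actions` and the dictionary-based state are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Action:
    verb: str  # "create", "update" or "delete"
    name: str  # name of the resource the action applies to


def reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Action]:
    """Compare the described state with the actual state and list what must change."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(Action("create", name))
        elif actual[name] != spec:
            actions.append(Action("update", name))
    for name in actual:
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


def apply_actions(actions: List[Action], desired: Dict[str, dict], actual: Dict[str, dict]) -> None:
    """Bring the (simulated) actual state in line with the desired state."""
    for action in actions:
        if action.verb == "delete":
            del actual[action.name]
        else:
            actual[action.name] = dict(desired[action.name])
```

A real controller runs this comparison over and over. Once the states match, `reconcile` returns an empty list and nothing is touched.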
Why Kubernetes for Smart Factories
The IT infrastructure on the OT side of the business usually lags behind the development in other parts of the business. This can be a challenge when Smart Factory use cases are ready to be rolled out to multiple sites, because more often than not the IT infrastructure differs considerably from site to site. The sites were designed at different times, originally built for different companies and then acquired over time, or just built by colleagues with different design approaches. Different network segments, different deployment tools, different backup strategies or different solutions for monitoring and logging are just some of the things that need to be considered when a new solution is rolled out. This all takes up time, slows down the roll-out or can bring it to a halt entirely. Even if an initial roll-out to multiple sites was successful, operating and maintaining the components of a connected factory on different infrastructure takes up a lot of time and is inherently inefficient. This is why we like to build our Smart Factory solutions on a harmonized tech stack based on Kubernetes.
Kubernetes acts as both an abstraction and a unification layer. It doesn’t matter if the underlying infrastructure uses bare-metal hosts or a virtualization solution like vSphere or Nutanix, or if a setup runs in the cloud (AWS, Azure). It (mostly) doesn’t matter what storage or networking solutions are in place either. We can just write our Kubernetes YAML manifests; containers and the Kubernetes API provide the unified abstraction.
What is GitOps
GitOps is a philosophy and concept for managing infrastructure and software deployments. It uses the Git version control system as a central and single source of truth that contains a complete and declarative description of a system (infrastructure and applications). Changes to the system can only be made through Git. In theory, I could completely delete my infrastructure and recreate it from the description stored in Git (excluding data stored in databases and other stateful services).
GitOps systems use a continuous reconcile loop, meaning they constantly compare the intended state from Git with the real state of the system and bring the system forward (or back, if you made manual changes bypassing the process) to the intended state.
With GitOps we can also take advantage of features provided by Git Forges (like GitHub or GitLab). Changes are made in feature branches and get merged via Pull/Merge Requests (PR/MR) which enforce code reviews and approvals. And we get automatic audit records about any changes. If you want to learn more about GitOps in general, I have a blog post explaining what GitOps is and why we use it.
Ignition and Kubernetes
For our Smart Factory Platforms we like to do as much as possible via GitOps, to have it properly automated and versioned. This also includes Ignition.
As Ignition already runs containerized and supports a high availability configuration, deploying it in Kubernetes is not complicated and directly gives us benefits. When we make a change (e.g. updated Ignition version), Kubernetes automatically performs a rolling restart, meaning it will first restart one pod, and only once that is back online it will restart the second one. This keeps Ignition available the entire time. Kubernetes also takes care of scheduling Ignition pods on different nodes (through podAntiAffinity), so even if an entire node crashes Ignition stays available.
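As a sketch of what such an anti-affinity rule can look like, the following Python snippet builds the `affinity` section of a Pod spec and prints it as JSON. The field names are the standard Kubernetes ones, but the `app` label key, the value `ignition` and the choice of a hard (`required...`) rule rather than a soft (`preferred...`) one are assumptions, not our actual manifests.

```python
import json


def anti_affinity_for(app_label: str) -> dict:
    """Build an affinity section that keeps Pods with the same app label on different nodes."""
    return {
        "podAntiAffinity": {
            # Hard rule: the scheduler refuses to co-locate matching Pods.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    # Assumed label; it must match the labels on the Ignition Pods.
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # One Pod per node, identified by the node hostname label.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


if __name__ == "__main__":
    print(json.dumps({"affinity": anti_affinity_for("ignition")}, indent=2))
```

With a hard rule, a second Ignition Pod stays pending if no other node is free. A soft rule would allow co-location as a last resort.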
Getting to these benefits involved quite some work as Ignition is not (yet) Kubernetes-native.
The first point is building the container image for Ignition. We use several 3rd party modules like the MQTT-Engine module. To avoid having to manually install them or set up some bootstrap process, we just build our own Ignition image that pre-bundles all modules. This also makes sure Ignition and modules are always versioned together.
The second point is utilizing the failover feature of Ignition. You can (and should) deploy Ignition in a high availability configuration with Gateways running in two pods. One acts as the master, the other as a standby. Should the master become unavailable, the standby takes over. We combined this with the automatic status-based routing of Kubernetes Ingress Controllers. Just pointing the ingress controller to both pods doesn't work, because it would then do round-robin load balancing across them. Instead we declare the master Pod as the primary target for the Ingress route we define and expose (so users can reach Ignition by just opening something like ignition.my-cluster.smartfactory.coolcompany and don’t need to fiddle with changing IPs). If this target goes down, the Ingress controller falls back to the default route, which we configured as the standby Pod. Once the master Pod is back up again, the controller switches back. For users of the Ignition UI this is seamless, as the client automatically handles reconnects and the perceived downtime is normally just a few seconds.
The last but very important point is backups. This is two-fold: Our Ignition projects store their data in an external Postgres database which we deploy and manage with a Kubernetes Operator. This operator gives us automatic backups to external Object Storage (in our case AWS S3 Buckets) and easy restore if needed. For Ignition itself we utilize the built-in scheduled backup functionality to create a nightly backup. As it has no means to push these backups to external storage, we use a little trick: We mount an S3 bucket into the container as a filesystem (using the standardized Container Storage Interface (CSI)) and tell Ignition to place its backups there. From Ignition's point of view it's just another folder; for us it is storage that is decoupled from the cluster and site. That way backups are automatically stored externally and we can recover even if the entire cluster is destroyed.
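The routing decision described above can be sketched in a few lines of Python. This only models the logic (primary first, standby as fallback, never both at once); it is not ingress controller configuration, and `Backend` and `pick_backend` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    name: str
    healthy: bool


def pick_backend(primary: Backend, fallback: Backend) -> Optional[Backend]:
    """Route to the primary while it is healthy, otherwise to the fallback."""
    if primary.healthy:
        return primary
    if fallback.healthy:
        return fallback
    return None  # nothing can serve the request
```

Because the primary is always checked first, traffic returns to the master as soon as it reports healthy again.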
Ignition Development Workflow
The heart of Ignition is not the base software but the custom projects built with it. And we want these to also follow our usual development and deployment processes. This means having separate DEV, TEST and PROD environments and making deployments automated and versioned.
Our developers usually work with Ignition on their local machines so they don't interfere with each other and have direct and full access to everything. To help with setup we have a docker-compose file that spins up Ignition itself along with a Postgres Database instance for storage and an MQTT broker for communication.
Ignition projects are stored as files (mainly JSON for the views and Python files for scripts) in a specific folder (data/projects/<projectName>). This means developers can get started quickly by providing these files from Git or uploading an Ignition project backup, and importing a database dump for Postgres.
During development, once a developer is satisfied with their changes, they check the project files into Git. When multiple developers are working on a project, there is a risk of merge conflicts (specifically in the resource.json file that Ignition automatically creates for each view). Not checking it in is not a solution, because without resource.json newly created views won't be discovered by the gateway. Some people try to sanitize the file with additional scripts to reduce merge conflicts. In our projects, the merge conflicts have not annoyed us enough to go down that path yet.
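For readers who do want to try the sanitizing route, a script along these lines is one possible starting point. It is not something we use. The attribute names in `VOLATILE_ATTRIBUTES` are an assumption about which fields change on every save, so check them against your own resource.json files and test whether your gateway accepts the cleaned files before relying on it.

```python
import json
from pathlib import Path

# Assumption: these keys under "attributes" change on every save and cause
# most merge conflicts. Verify against real resource.json files first.
VOLATILE_ATTRIBUTES = ("lastModification", "lastModificationSignature")


def sanitize_resource(text: str, volatile=VOLATILE_ATTRIBUTES) -> str:
    """Drop volatile attributes and write keys in a stable order."""
    data = json.loads(text)
    attributes = data.get("attributes", {})
    for key in volatile:
        attributes.pop(key, None)
    return json.dumps(data, indent=2, sort_keys=True) + "\n"


def sanitize_tree(project_dir: Path) -> int:
    """Sanitize every resource.json below project_dir; return how many files changed."""
    changed = 0
    for path in project_dir.rglob("resource.json"):
        original = path.read_text(encoding="utf-8")
        cleaned = sanitize_resource(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Such a script could run as a Git pre-commit hook, so that every developer commits the same normalized form.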
Deploying projects GitOps style
Even though Ignition is not Kubernetes-native, it has a nice feature to help us: It will hot-reload project files when they change on disk. We used this to build ourselves a GitOps-style deployment for Ignition projects.
Our developers manage the project files in Git and commit and push every change they make.
So we wrote a little custom Python tool (aptly named the ignition-project-deployer) that runs alongside Ignition (as a sidecar) and monitors Git. Whenever there are changes it will replace the files in the projects folder with the files from Git. The new files will then be automatically loaded by Ignition. But we didn't want this extra tool to deal with all the Git complexity.
And this is where FluxCD comes in. FluxCD already deals with getting (pulling) commits and data from Git repositories. Usually the data is then picked up by other FluxCD controllers to perform deployments in Kubernetes. But in our case it is our own ignition-project-deployer that gets the data and performs a deployment by switching out the files in the Ignition project directory.
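The core of the file switch can be sketched as follows. This is a simplified illustration and not the actual ignition-project-deployer: it assumes the files fetched by FluxCD have already been unpacked into a local `source` directory, and the function name `sync_project` is hypothetical.

```python
import filecmp
import shutil
from pathlib import Path
from typing import List


def sync_project(source: Path, target: Path) -> List[str]:
    """Make target mirror source; return the relative paths written or removed."""
    changed = []
    target.mkdir(parents=True, exist_ok=True)
    source_files = {p.relative_to(source) for p in source.rglob("*") if p.is_file()}

    # Copy only new or modified files, so untouched resources are not rewritten.
    for rel in sorted(source_files):
        dst = target / rel
        if not dst.exists() or not filecmp.cmp(source / rel, dst, shallow=False):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source / rel, dst)
            changed.append(rel.as_posix())

    # Remove files that no longer exist in Git, deepest paths first,
    # and clean up directories that end up empty.
    for path in sorted(target.rglob("*"), reverse=True):
        rel = path.relative_to(target)
        if path.is_file() and rel not in source_files:
            path.unlink()
            changed.append(rel.as_posix())
        elif path.is_dir() and not any(path.iterdir()):
            path.rmdir()
    return changed
```

Comparing file contents before copying keeps the number of on-disk changes small, so Ignition's hot-reload is only triggered for resources that actually differ.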
Moving through the environments
As already mentioned above, we rely on a deployment model with three environments: local development, then (integration) testing on a real cluster, then the actual production environment.
To move a new version of a project to the testing environment, a developer just has to commit and push their changes in Git on the main branch. FluxCD will automatically see the new commits and pull them. It does this because we have configured FluxCD to simply watch the main branch. We could also temporarily switch a single cluster to a different branch. This is very easy as the manifests for FluxCD itself are also managed via GitOps.
They (and colleagues from the customer) can then test the changes. In our Smart Factory projects we aim to have all clusters and environments structured and set up the same. And usually we will either bridge data from the PROD environment into the TEST environment broker, or just connect the same machines from PROD into TEST (assuming the machines can handle double the connections).
By doing so, we can test our development against actual live data, which really helps to make a solution production ready. Once testing is done and, for example, colleagues from the customer have approved the new features, we can move the solution to PROD.
The only question that remains is, how to promote changes from the testing to the production environment.
We use Git Tags for that (another approach would be Git Branches, but in our experience this just makes it more complex and error-prone). Once we are satisfied and a version running in the testing environment is ready for production (this will include having the customer verify the changes), a developer creates a Git Tag for the commit in Git. And since we configured FluxCD in the production environment to look for Git tags, it will automatically see this new tag and pull the data for it. Then our ignition-project-deployer takes over.
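Picking the version to deploy from a list of tags can be sketched like this. The sketch assumes a `vMAJOR.MINOR.PATCH` naming scheme, which is a hypothetical convention chosen for illustration. The function `latest_release_tag` is likewise not part of FluxCD or of our tooling.

```python
import re
from typing import List, Optional

# Assumed tag format: v<major>.<minor>.<patch>, e.g. v1.4.0
_TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_release_tag(tags: List[str]) -> Optional[str]:
    """Return the tag with the highest version number, ignoring non-release tags."""
    versions = []
    for tag in tags:
        match = _TAG_PATTERN.match(tag)
        if match:
            # Compare numerically so that v1.10.0 sorts after v1.9.3.
            versions.append((tuple(int(part) for part in match.groups()), tag))
    return max(versions)[1] if versions else None
```

Comparing the numeric parts rather than the raw strings matters: a plain string sort would rank `v1.9.3` above `v1.10.0`.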
This approach makes deploying very easy for developers: committing and pushing files is something they would do anyway, and creating Git Tags is a standard operation in any Git UI or tool. We also get clear versions for free that we can easily roll back to if a problem occurs, and we have an audit record in which we can directly see which developer made what changes.
Watch out! It needs more than just the project files
Besides the above described project files there are other important components to a working Ignition setup, namely:
- Tags and User Defined Datatypes (UDTs). UDTs are custom data structures that define templates for more or less complex sets of tags.
- Images and Icons
- Gateway Configurations
Unfortunately, these components currently cannot be handled within the GitOps workflow. This is not a big issue during daily development but can be annoying if larger changes on UDTs need to be deployed to multiple Ignition instances. After initial setup we found that images, icons and gateway configurations do not change often. So far, this did not cause much headache in our workflow.
To keep a record of the changes, we export UDTs from the designer and store the JSON-file as well as images, icons, and a local gateway backup in a Git repository. This backup is only done to help developers. For our production instance we perform automated nightly backups as described above.
The checked-in project files also do not cover the basic configuration of the Ignition Gateway, such as the connections to the database and the MQTT broker. Even though this information is available in Kubernetes, we currently have to extract it manually and use the Ignition UI to perform the initial configuration.
Update incoming
The next major Ignition update, version 8.3, is supposed to bring new features that make it easier to use source control for Ignition. We don’t really expect that this will change our current workflow completely. From what can be found about 8.3, all gateway configs will be moved to the filesystem, which makes it possible to version control the config in Git. For the tags, we don’t expect any major changes, but we keep our fingers crossed.
Summary
From a developer perspective, working with Ignition in a GitOps setup is great. The one drawback that can be annoying is handling changes in the tag system. But overall, the positive aspects make up for this. You get all the benefits of source control, like tracking changes, handling collaboration and having everything in order in case anything goes south. In such cases, it is straightforward to fall back to a working solution or to get the complete system back up and running quickly.
From an operations and platform engineering perspective, Ignition still has a way to go to be Kubernetes-native and fully automated. But thanks to our solution everything that happens regularly (like updates and project deployments) has proper GitOps integration and is hassle-free. And we get normal Kubernetes automation, like restarting an Ignition instance if it should crash, or moving instances to another node if a complete node fails.
To sum it up: With our GitOps approach, development and testing are a breeze, operations are greatly simplified, and there are fewer headaches if something should break.