Do you need Kubernetes?

Check the Reddit or Hacker News comments on any random article about Kubernetes and you will likely see strong opinions about Kubernetes being way too complex for whatever scenario or use case is described. Some will tell you to just rent your own server (usually a VPS, Virtual Private Server), others will tell you Serverless is the solution to everything. In this post I want to take a closer look at whether and when Kubernetes is the right solution and when going Serverless is the right choice.

Run your own server

The classic approach is to run your own Linux server and host your workloads there. For a long time the LAMP stack (Linux, Apache, MySQL, PHP) was more or less the standard. It is still widely used, for example by sites running WordPress, which according to usage statistics such as those from W3Techs still powers more than 40% of all websites.

Nowadays, running your own server is very easy. You can get a Linux box from an endless number of hosters. Hetzner is one of the biggest and most often mentioned, at least in Europe, because it has a wide offering and low prices. Provisioning a new server takes just a few clicks, and there are even templates to pre-install software like WordPress.

Such a setup is quickly done, but it is nowhere near a state in which you could run a serious business on it. You need a setup that is production-ready, meaning that you can safely and securely host workloads for your company and earn money with it. If your site goes down, you lose money and get angry customers. If a hacker steals your customers' data, you are facing real problems and probably lawsuits. If you can't recover from a disaster quickly (maybe a hacker deletes your data, maybe you accidentally delete your server yourself), you risk bankruptcy.

For me, being production-ready means you must:

  • automate your setup (I call that "GitOps instead of ClickOps")
  • make regular backups and have a tested restore procedure
  • secure your server against hackers (firewall, authentication, etc.)
  • update components regularly to get security patches

So you start building up your automation using any number of tools. The most prominent ones are Terraform for infrastructure automation and Ansible for configuration. If you do it properly, you automate everything from provisioning your server and configuring an empty base system to deploying your applications, so that you can provision a new server from scratch with just two tool runs. Depending on what and how many applications you run on your server, you might also consider running them in containers to simplify installation and dependency management.

I ran such a setup for a number of years for my HomeLab: a single custom-built NAS server stuffed in a corner of my apartment, initially running Debian and later Arch Linux, serving both file storage and applications. The applications were all packaged into Docker containers. I handled provisioning and deployment with Ansible and a collection of custom-written roles. That setup served me well, but became increasingly hard to orchestrate with the rising number of containers and their interactions. This complexity was one of the reasons why I finally migrated over to a Kubernetes-based setup.

Such a setup works well if you can fit your entire solution onto one server, or maybe a few of them if your setup is simple and static enough to make manual multi-server orchestration feasible. By default you have no high availability: if your server goes down, your applications are down. You can add HA yourself, but it will be a complex endeavor and will likely require external components like load balancers or network storage that only a few hosting providers offer.

Yes, using a simple server you can host applications very cheaply, and if it's just for you or you don't require great uptime or redundancy, you can get surprisingly far with a cheap and simple setup. But realistically speaking, anything seriously commercial will need more.

The DHH way

Over the last few years several tools have been developed and become popular that make deploying Docker-based applications to Linux servers easier. Top among them is Kamal from Ruby on Rails creator David Heinemeier Hansson (DHH). Kamal is a CLI deployment tool that takes care of setting up the necessary services on an empty Linux box and then builds and deploys applications. It is aimed at classic backend applications with optional databases. Kamal was built by DHH for use at his company 37signals during their cloud exit.

In comparison to building your own automation, Kamal certainly reduces effort and complexity considerably. But this also comes with a loss of flexibility. If your use case does not conform to what DHH and co envision for their tool, you will have a hard time. And for a production-ready setup you still need to deal with things like load balancing, high availability, storage, backups, monitoring, and others.

There have also been reports that Kamal does not configure Linux hosts in a secure manner. It also does not take care of system management or updates. So you still need to handle all these topics.

If Kamal is not to your liking, Dokku is recommended by many as a simple Platform-as-a-Service solution, but I have not looked at it in detail. And in the end it suffers from the same limitations as Kamal: you still need to provision and manage your own servers, and deal with high availability and scaling.

Kubernetes

Way up the complexity ladder from simple servers sits Kubernetes. It is the go-to solution if you have to orchestrate a large number of applications or if your workloads are very dynamic. The downside of complexity (both in setting up and operating Kubernetes and in configuring and deploying workloads) comes with an upside of flexibility and possibilities for all aspects of application deployment and operations. These range from security (role-based access control and network isolation) through routing (Ingress and service meshes) and autoscaling (both vertical and horizontal, for workloads and nodes) to automations and abstractions via Kubernetes Operators.
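
To make the horizontal autoscaling part a bit more concrete, here is a minimal Python sketch of the scaling rule that the Kubernetes documentation describes for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the measured metric to the target metric, rounded up. The function name and the example numbers are my own. The sketch deliberately leaves out things the real autoscaler also does, such as honoring min/max replica limits and stabilization windows.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,  # 0.1 is the documented default tolerance
) -> int:
    """Simplified version of the documented HPA scaling formula."""
    ratio = current_metric / target_metric
    # If the metric is close enough to the target, do not scale at all.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to whole replicas.
    return math.ceil(current_replicas * ratio)


# 4 pods at 90% average CPU with a 60% target: 4 * 1.5 = 6 pods.
print(desired_replicas(4, 90.0, 60.0))  # 6
```

So if load drops and the same four pods sit at 30% with a 60% target, the rule scales down to two pods, while a reading of 63% is within the tolerance and changes nothing.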

The distributions and hosting options for Kubernetes are similarly vast. It starts small and simple with single-node clusters with K3s or k0s that can run on computers as small as a Raspberry Pi, although both tools can also easily handle multi-node deployments. For bigger needs you have the self-hosted enterprise distributions (e.g. Rancher or Nutanix NKP). Or if you want someone else to run Kubernetes for you, you can go with managed installations from the big cloud providers (Azure AKS, AWS EKS, Google GKE) or from external companies (e.g. Giant Swarm, which also supports on-premise clusters). Cloud providers also offer options like EKS Auto Mode or EKS Fargate to avoid having to manage nodes.

Most often you will choose a managed cloud provider offering. Even if they cost more, you will save on complexity and person-hours for managing such a setup. And in the cloud you get full flexibility to scale your clusters based on your needs (fully automatically increasing capacity within a matter of minutes is nothing special nowadays).

But for complex on-premise setups, which are often part of the Smart Factory projects I build and manage in my job at MaibornWolff, getting managed Kubernetes from Giant Swarm can be a life-saver. Cloud-based setups don't work if you need to connect to production equipment in factories that has to be isolated from a network perspective, and most manufacturing companies don't have an excess of Kubernetes engineers to run their clusters. So getting a competently managed installation removes a large hurdle.

But Kubernetes does not have to be big. Single-node setups are equally viable and give you the same deployment and management mechanisms (e.g. GitOps) as the big setups, just on limited hardware. K3s is my go-to solution there. And if your compute resources are more constrained, KubeSolo could also be an option.

For my own private HomeLab setup I'm using a single-node K3s cluster that runs all my applications (both open-source software like Nextcloud or Bitwarden/Vaultwarden and my own custom solutions). Of course, I wouldn't be a good platform engineer if I hadn't automated the heck out of it. Everything is deployed GitOps-style with FluxCD. For monitoring I use a pretty default Prometheus, Loki and Grafana stack. And of course I have automated backups which I test regularly by restoring a subset of my workloads from scratch on a new temporary server.

Granted, that setup is totally overkill. But it allows me to play with the same technologies and concepts I deal with at work, without any constraints. It helps me learn, and gives me a way to test new concepts without much effort or risk. And, at least for me, it is a lot of fun.

Kubernetes as an abstraction

Even if you don't need the complexity of Kubernetes yet, you might still choose it for another reason: abstraction. Kubernetes provides a generic and open interface to deploy and run applications, regardless of your infrastructure. So you can start your platform on one cloud provider and later move to a different one, or even to a self-hosted solution. You will probably still need to change some parts that interact with outside components (e.g. ingress and cloud APIs), but for the most part application deployments that worked in one Kubernetes environment will still work in a different one. This makes you more independent and reduces vendor lock-in, allowing you to move providers more easily and cheaply. Serverless, which I discuss in the next section, is in contrast often proprietary and would require major rework to switch solutions.

Serverless

At the opposite end of the complexity scale from Kubernetes sit Serverless solutions. As the name implies, these solutions do not require you to manage your own servers (which even managed Kubernetes offerings require to a certain extent). I distinguish two broad categories when thinking about Serverless:

  • Container services: These offerings run your container workloads, often with automatic horizontal scaling capabilities and managed services like databases. Prominent solutions are Railway, Render, Azure Container Apps, Fly.io and others
  • Functions-as-a-service: They run your code to serve web requests and deal with scaling automatically. In contrast to Container services they (at least conceptually) set up a new instance for each request, so they must be treated as completely stateless and lightweight. They might use containers or higher-level abstractions under the hood. Prominent solutions are AWS Lambda, Azure Functions, Vercel, Cloudflare Workers and others

EKS Fargate also calls itself a serverless solution, but I see it more as an abstracted-away Kubernetes setup instead of a true Serverless offering. You still need to deal with most of the Kubernetes complexity instead of thinking just in terms of your functionality.

Serverless solutions (especially those from smaller companies and not from the big cloud providers) are very big on reducing complexity. You have minimal or no infrastructure to manage. Deployments can be as easy as giving the service access to your GitHub source code repository and it will automatically build and deploy your application (for example Railway Quick Start). Databases and other components, and often even third-party offerings, can be easily integrated (e.g. external application monitoring solutions like Sentry).

Serverless is generally billed based on actual consumption. With Kubernetes clusters like AKS and EKS you pay for the capacity of your nodes, regardless of whether they are fully utilized or nearly idle. In contrast, with solutions like Lambda you pay only for the time a function invocation runs. This request-based billing means you can start very cheaply without large commitments. If your service is not used, it doesn't cost anything. And Cloudflare and others have started a trend of billing only for the CPU time actually used instead of function runtime. So if your function is sitting idle waiting for a response from something (like a database or an external API), you will only be billed for the few milliseconds of actual compute, whereas in the past you would have been billed for the potentially several seconds of function runtime needed to complete a request. That billing change makes the whole thing even cheaper.
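
To illustrate the difference between the two billing models, here is a small Python sketch. All numbers are made up for illustration and are not taken from any provider's price list: I assume 1000 requests that each take 2 seconds of wall-clock time (mostly waiting on a database) but only 5 milliseconds of actual CPU time. The names `Request` and `billed_ms` are hypothetical helpers, not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    wall_ms: int  # total runtime of the invocation, including waiting on I/O
    cpu_ms: int   # time the function actually spent computing


def billed_ms(requests: list[Request], by_cpu_time: bool) -> int:
    """Sum the billable milliseconds under one of the two billing models."""
    return sum(r.cpu_ms if by_cpu_time else r.wall_ms for r in requests)


# Illustrative workload: 1000 requests that mostly wait on a database.
requests = [Request(wall_ms=2000, cpu_ms=5)] * 1000

duration_billed = billed_ms(requests, by_cpu_time=False)
cpu_billed = billed_ms(requests, by_cpu_time=True)

print(duration_billed)                # 2000000
print(cpu_billed)                     # 5000
print(duration_billed // cpu_billed)  # 400
```

In this made-up case the billable time shrinks by a factor of 400. How much of that shows up on your invoice depends on the per-millisecond prices of the respective models, which this sketch ignores.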

Serverless is also excellent at scaling. You don't need to provision instances to handle your load. If you have no requests, nothing needs to run. And if you experience an onslaught of requests because your service has suddenly become popular, you can rest easy: the provider will spin up as many function instances as needed on the fly.

All this makes Serverless solutions a great offering for startups or for teams with dynamic and shifting workloads that mostly deal with web applications and request-based handling in publicly available applications. On the downside you have less control, but you also have far less to take care of. Especially for small or dynamic systems, or if you don't have the people to deal with complex infrastructure, it is often the only practical way to get started.

Conclusion

So what is the right solution?

If you are developing public web/API applications and don't have special backend requirements (or can find managed services for them), going with a Serverless architecture and provider is the right choice. It is quick and cheap to get started, and you can easily deal with sudden usage spikes.

For Serverless you pay a premium for the abstractions providers offer. So only looking at raw compute costs (CPU, Memory), running your own servers would mostly be cheaper. But if you factor in personnel and scalability costs, Serverless options will be cheaper for a long time. If you have a simple architecture under a high but consistent and predictable load and infrastructure competence in your team, running your own servers can be the cheaper option. This could mean buying servers and colocation space or just renting VMs or bare metal hosts from a provider. The tipping point where Serverless becomes more expensive is different for each setup and team. Rule of thumb: If you don't know how to calculate that point, you have not yet reached it.
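
If you want a starting point for calculating that tipping point, here is a deliberately naive Python sketch. Every number in it is hypothetical (the request price, the server rent, the operations hours and the hourly rate), and the function names are my own. It also assumes that your own server costs the same no matter how many requests it handles, which stops being true once you need more hardware.

```python
def monthly_serverless_cost(requests_per_month: int, price_per_million: float) -> float:
    """Purely usage-based cost: no requests, no bill."""
    return requests_per_month / 1_000_000 * price_per_million


def monthly_server_cost(server_rent: float, ops_hours: float, hourly_rate: float) -> float:
    """Fixed cost: the rent plus the people who keep the server running."""
    return server_rent + ops_hours * hourly_rate


def break_even_requests(
    price_per_million: float, server_rent: float, ops_hours: float, hourly_rate: float
) -> float:
    """Requests per month at which both options cost the same."""
    fixed = monthly_server_cost(server_rent, ops_hours, hourly_rate)
    return fixed / price_per_million * 1_000_000


# Hypothetical inputs: 2.00 per million requests, 100 rent,
# 10 hours of operations work per month at 80 per hour.
print(monthly_server_cost(100.0, 10.0, 80.0))       # 900.0
print(break_even_requests(2.0, 100.0, 10.0, 80.0))  # 450000000.0
```

With these made-up inputs you would need around 450 million requests per month before your own server wins. Note that the personnel cost dominates the fixed side, which is exactly why raw compute prices alone are misleading.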

Kubernetes is the better option if you have a complex architecture that you might run in an on-premise or hybrid on-premise/cloud environment, or if you need fine control over your setup. For running "just" a few microservices or web applications, Kubernetes is totally overkill. But complex IoT and Smart Factory architectures like the ones I design and build in my job at MaibornWolff will often require Kubernetes. Such architectures include not only "normal" microservices, but also machine connectivity solutions, message brokers (MQTT and Kafka), data processing and analysis pipelines as well as big data and AI workloads. Many of these you can build from managed services offered by cloud providers. But as soon as you need to go on-premise into the factory or deviate from the given services, you will write your own services and orchestrate many components. And that is where Kubernetes shines.

For simple setups and for most companies developing "normal" web/API-based applications, Serverless and other managed services will be the better option. Choose Kubernetes only if your environment or complexity requires it.

I've only concentrated on a limited view in this post. There are many more options and ways to run your setups and workloads, and as many arguments for or against the different options. When comparing, keep not only the directly visible price of services/offerings in mind, but also the cost of manpower and complexity if you build stuff yourself.