A cloud bill can quickly grow many times over if we don't pay attention and actively work to keep it down.
Here I will list a few potential savings opportunities when running Kubernetes in the cloud.
Scale down to zero
Basically anything that does not need to run all the time should not run all the time. I know, it's genius.
Examples of this are development and staging workloads which can be scaled down daily and scaled back up on demand.
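To make the time-based idea concrete, here is a minimal Python sketch that decides how many replicas a development or staging workload should have at a given moment. The working hours, the weekday rule, and the function name `desired_replicas` are assumptions made up for this example; they are not taken from any particular tool.

```python
from datetime import datetime, time

# Hypothetical working hours; adjust to your team's schedule.
WORK_START = time(8, 0)
WORK_END = time(18, 0)


def desired_replicas(now: datetime, normal_replicas: int) -> int:
    """Return the replica count a dev/staging workload should have at `now`."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_work_hours = WORK_START <= now.time() < WORK_END
    # Outside working hours or on weekends the workload is scaled to zero.
    return normal_replicas if is_weekday and in_work_hours else 0
```

A scheduled job could call something like this and patch the replica count accordingly, although in practice an existing tool is usually the better choice.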
Another example is asynchronous workloads that do work based on messages in a message queue. These can be scaled down to zero if no messages are available in the queue and scaled back up again if new work needs to be done.
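The queue-driven case boils down to a small calculation. The sketch below is only an illustration of that logic, not how any specific autoscaler implements it; `replicas_for_queue`, `messages_per_replica`, and `max_replicas` are hypothetical names chosen for this example.

```python
import math


def replicas_for_queue(queue_length: int, messages_per_replica: int, max_replicas: int) -> int:
    """Return how many workers to run for the current queue backlog."""
    if queue_length <= 0:
        return 0  # no messages waiting: scale down to zero
    # One worker per `messages_per_replica` waiting messages, rounded up.
    wanted = math.ceil(queue_length / messages_per_replica)
    return min(wanted, max_replicas)
```

With 25 waiting messages, 10 messages per worker, and a cap of 5, this asks for 3 workers; with an empty queue it asks for none.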
Workload Resources Monitoring and Adjusting
We know that it's important to assign resource requests and limits to our Kubernetes deployments/pods, but the actual resource usage of a workload changes over time and, if we are lucky, it goes down.
When it does, that is a nice chance for us to lower our requests.
Of course, before you can adjust any resources you need to know what the utilization is, i.e. you need to have appropriate monitoring in place.
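Once usage data is available, one possible way to turn it into a request is to take a high percentile of the observed usage and add some headroom. The following Python sketch shows that approach; the 95th percentile, the 20% headroom, and the function name `recommended_request` are assumptions for illustration, not recommendations from Kubernetes or any monitoring tool.

```python
import math


def recommended_request(samples_millicores: list[int], percentile: int = 95,
                        headroom_percent: int = 20) -> int:
    """Suggest a CPU request in millicores from observed usage samples."""
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile: the smallest sample that covers
    # `percentile` percent of all observations.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    usage = ordered[rank - 1]
    # Add headroom on top of the observed usage and round up.
    return math.ceil(usage * (100 + headroom_percent) / 100)
```

Comparing the result with the currently configured request shows how much could be given back; the same idea applies to memory, though memory requests usually deserve more conservative headroom.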
Spot Instances (AWS)
There are three types of instances in AWS, ordered by cost descending:
- On-demand,
- Reserved and
- Spot (up to 90% cheaper than on-demand).
So why does anybody use on-demand or reserved instances instead of spot instances all the time?
Well, for one, spot instances are not available all the time. They come and go, which is why it is important to define a diversified pool of instance types so that some spot capacity is likely to be available at any moment.
The other caveat with spot instances is that the workloads running on them need to be suited to them, i.e. able to withstand interruptions when AWS takes the spot instances back.
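The point about a diversified pool can be sketched in a few lines of Python. This is a simplified illustration only: real provisioning tools make this decision for you, the set of currently available types is made-up input, and falling back to on-demand is an assumption of this example rather than something every setup needs.

```python
def pick_instance_type(pool: list[str], available_spot: set[str],
                       fallback_on_demand: str) -> tuple[str, str]:
    """Return (instance_type, capacity_type) for the next node to launch."""
    # `pool` is ordered by preference; take the first type that currently
    # has spot capacity.
    for instance_type in pool:
        if instance_type in available_spot:
            return instance_type, "spot"
    # No spot capacity in the whole pool: fall back to on-demand (assumption).
    return fallback_on_demand, "on-demand"


# Example pool of similarly sized instance types.
POOL = ["m5.large", "m5a.large", "m4.large"]
```

The more instance types the pool contains, the less likely it is that the loop falls through to the more expensive fallback.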
Finally, here is a list of tools that might help you implement the ideas mentioned above:
Kube-Downscaler can scale down workloads during configured hours, e.g. non-work hours and days. There are also other tools and ideas for time-based scaling in Kubernetes, so you can research them and maybe find a solution that fits your needs better.
KEDA is an event-based Kubernetes autoscaler which, among other things, can scale deployments down to zero based on external metrics. This is relevant for as long as the Kubernetes HPA is not able to do this on its own.
AWS Karpenter is node-provisioning software that helps us increase the efficiency and reduce the cost of running our workloads. Among other things, it can utilize spot instances for our workloads.