Understanding Vertical Pod Autoscaling in Kubernetes
And why you should not use it in auto mode

Vertical Pod Autoscaling is one of those cool Kubernetes features that are not used enough, and for good reason. Kubernetes was built for horizontal scaling, and, at least initially, scaling a Pod vertically did not seem like a great idea. Instead, it made more sense to create a copy of the Pod to handle the additional load.
However, that approach requires careful resource optimisation. If you don't tune your Pod appropriately by setting proper resource requests and limits, you may end up either evicting your Pods too often or wasting resources. Request too little and your Pod may not have enough resources to start up; request too much and it holds on to capacity that other Pods could put to good use.
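For context, this is roughly what a requests-and-limits block looks like on a container spec. It's a minimal sketch; the image, names, and values are purely illustrative and not taken from any particular workload:

```yaml
# Illustrative Pod spec; names and values are assumptions for the example.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "250m"      # scheduler guarantees at least this much CPU
          memory: "128Mi"  # scheduler guarantees at least this much memory
        limits:
          cpu: "500m"      # container is throttled above this CPU usage
          memory: "256Mi"  # container is OOM-killed if it exceeds this memory
```

The gap between what you request and what your workload actually uses is exactly what you have to tune, and what Vertical Pod Autoscaling tries to manage for you.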
The idea of using Kubernetes is to pack as many containers as possible into the least infrastructure (you obviously still need to keep a buffer to absorb a node failure, but you get the point). Developers and system administrators have long struggled to find the optimum values for resource requests and limits. Tuning them requires a fair amount of monitoring and an understanding of how both are utilised, whether through benchmark testing or through general…