
As a customer of Google Kubernetes Engine (GKE), you’ve chosen a managed Kubernetes platform that takes on a high degree of operational work, from automatic upgrades to simplified node management. This efficiency lets you focus more on your applications and less on the underlying infrastructure. In an ideal world, this streamlined experience, combined with GKE’s robust autoscaling capabilities, would ensure perfect workload scheduling all the time: your applications would seamlessly scale up and down, always finding the resources they need precisely when they need them.
Unfortunately, the real world presents a few more challenges. GKE offers powerful four-way autoscaling (Horizontal Pod Autoscaler, Vertical Pod Autoscaler, Cluster Autoscaler, and Node Auto Provisioning) that provides the building blocks to address the scalability needs of workloads and infrastructure. However, running an efficient platform for today’s dynamic workloads involves more than just ensuring scalability. Factors like cost optimization, capacity availability, the speed at which resources can scale, overall performance, and the flexibility of your infrastructure all profoundly affect and constrain how workload scheduling can be planned on GKE. Honestly, it can get a bit cloudy as to which strategy is best and what the trade-offs between these parameters are.
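To make the pod-level half of that four-way stack concrete, here is a minimal sketch of a HorizontalPodAutoscaler manifest; the Deployment name `web-app`, the replica bounds, and the 70% CPU target are illustrative assumptions, not values from this post:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app            # hypothetical Deployment to scale
  minReplicas: 2             # floor on replica count
  maxReplicas: 10            # ceiling on replica count
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU utilization exceeds 70%
```

When scaled-out Pods no longer fit on existing nodes, Cluster Autoscaler and Node Auto Provisioning act at the infrastructure level to add capacity, which is exactly where the scheduling constraints discussed in this post come into play.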
In this blog, we will focus specifically on the GKE scheduler and the factors that can influence its workload placement decisions when capacity constraints exist. We will explore how to plan and design for these scenarios using various GKE features and workload configurations.