Adopt new VM series with GKE compute classes, Flexible CUDs

Organizations are always looking to gain an edge from the latest advances in cloud computing. The new Gen4 machine series for Google Compute Engine and Google Kubernetes Engine (GKE), including N4, C4, C4A, and C4D, offer significant improvements in performance, cost-efficiency, and capabilities. However, migrating to new hardware isn’t always straightforward. Teams often face challenges with compatibility testing, regional capacity, and navigating financial commitments, all of which can slow down adoption.

The good news is that two powerful Google Cloud features, when used together, provide a strategic and cost-effective path to adopting a new machine series without the usual overhead. By combining the technical agility of GKE compute classes with the financial adaptability of Compute Flexible Committed Use Discounts (Flex CUDs), you can innovate faster, maintain resilience, and optimize costs, all at the same time. Even better, Compute Flex CUDs also allow discounted consumption of Autopilot and Cloud Run, making it easy to consume the right compute for your workload. Let’s dive in.

The challenge: Overcoming hardware adoption hurdles

While adopting the latest machine series unlocks new levels of performance and efficiency, organizations can face some challenges during the transition:

Compatibility testing: Before a full migration, teams need to validate that their applications perform as expected on a new machine series. This requires a strategy for safely introducing new hardware to gather performance data and ensure compatibility.
Navigating regional capacity: As new machine series expand to more regions, their availability can vary. This creates a need for a fallback option to ensure application availability isn’t impacted by capacity limitations in a specific location.
Aligning financial commitments: Resource-based CUDs provide excellent value but are tied to specific machine families, which makes them less flexible for teams that want to adopt newer, more cost-performant hardware while still under an existing commitment term.
Migration of workloads: The process of configuring, migrating, and managing workloads across multiple machine types can be operationally complex. This requires significant coordination from platform teams to execute smoothly.

The solution, part 1: GKE compute classes

GKE compute classes provide an elegant technical solution to the challenges of hardware adoption. Instead of tying your workloads to a single machine type, you can define a prioritized list of machine families that GKE can use for autoscaling. This gives you a flexible and resilient way to incrementally integrate cutting-edge technologies.

With compute classes, you can define a policy that tells GKE to prioritize a new, cost-performant machine family (like N4) but automatically fall back to an established machine family you’re already using (like N2 or N2D) if the first choice isn’t available. Compute classes let you safely roll out new hardware in waves by incrementally subscribing new workloads to the compute class. This helps to minimize operational risk and downtime.

How it works: An example

Let’s say you want to take advantage of the superior price-performance of the new N4 machine series for a stateless web application, but you want to fall back to the previous-generation N2 series for large, unexpected spikes in traffic.

You can create a custom ComputeClass object with a prioritized list of machine families:

ComputeClass Manifest (n4-fallback-class.yaml):
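
A minimal sketch of what such a manifest might look like, assuming the `cloud.google.com/v1` ComputeClass API with node pool auto-creation enabled. The class name is taken from the file name above; the `activeMigration` and `whenUnsatisfiable` settings are illustrative assumptions, not choices stated in the text.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: n4-fallback-class
spec:
  # Rules are evaluated top to bottom: try N4 first, then fall back to N2.
  priorities:
  - machineFamily: n4
  - machineFamily: n2
  # Assumption: let GKE create node pools that match the rules above.
  nodePoolAutoCreation:
    enabled: true
  # Assumption (optional): move workloads back to N4 once capacity is available.
  activeMigration:
    optimizeRulePriority: true
  # Assumption: if no rule can be satisfied, scale up on whatever is available.
  whenUnsatisfiable: ScaleUpAnyway
```

A workload would then opt in to the class through a node selector. The Deployment below is hypothetical (its name, labels, replica count, and image are placeholders); only the `cloud.google.com/compute-class` selector ties it to the class.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app            # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      # Subscribes this workload to the compute class defined above.
      nodeSelector:
        cloud.google.com/compute-class: n4-fallback-class
      containers:
      - name: web
        image: nginx:stable   # placeholder image
```

Applying the selector to one workload at a time is one way to roll out the new machine series in waves, as described earlier.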

Source Credit: https://cloud.google.com/blog/products/compute/adopt-new-vm-series-with-gke-compute-classes-flexible-cuds/