Introducing Flex-start VMs for the Compute Engine Instance API.

Innovating with AI requires accelerators such as GPUs that can be hard to come by in times of extreme demand. To address this challenge, we offer Dynamic Workload Scheduler (DWS), a service that optimizes access to compute resources when and where you need them. In July, we announced Calendar mode in DWS to provide short-term ML capacity without long-term commitments, and today, we are taking the next step: the general availability (GA) of Flex-start VMs.

Available through the Compute Engine instance API, gcloud CLI, and the Google Cloud console, Flex-start VMs provide a simple and direct way to create single VM instances that can wait for in-demand GPUs. This makes it easy to integrate this flexible consumption option into your existing workflows and schedulers.

What are Flex-start VMs?

Flex-start VMs, powered by Dynamic Workload Scheduler, introduce a highly differentiated consumption model that’s a first among major cloud providers, letting you create single VM instances that provide fair and improved access to GPUs. Flex-start VMs are ideal for defined-duration tasks such as AI model fine-tuning, batch inference, HPC, and research experiments that don’t need to start immediately. In exchange for being flexible with start time, you get two major benefits:

Dramatically improved resource obtainability: By allowing your capacity requests to persist in a queue for up to two hours, you increase the likelihood of securing resources, without needing to build your own retry logic.
Cost-effective pricing: Flex-start VM SKUs offer significant discounts compared to standard on-demand pricing, making cutting-edge accelerators more accessible.

Flex-start VMs can run uninterrupted for a maximum of seven days and consume preemptible quota.