
AI/ML and HPC data-load acceleration
Hyperdisk ML is specifically optimized to accelerate data load times for inference, training, and HPC workloads, improving model load time by 3-5x compared to common alternatives4. Hyperdisk ML is particularly well suited for serving tasks compared to other storage services on Google Cloud because it can concurrently deliver exceptionally high aggregate throughput to many VMs (up to 1.2 TiB/s per volume, more than 100x the performance of competitive offerings)5. You write data once (up to 64 TiB per disk) and attach multiple VM instances to the same volume in read-only mode, as in the sketch below. With Hyperdisk ML you can accelerate data load times for your most expensive compute resources, like GPUs and TPUs. For more, check out g.co/cloud/storage-design-ai.
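To make the write-once, read-many pattern concrete, here is a minimal sketch using the google-cloud-compute Python client. The project, zone, disk, and VM names, along with the capacity and throughput values, are illustrative assumptions, not recommendations.

```python
from google.cloud import compute_v1

PROJECT, ZONE = "my-project", "us-central1-a"  # placeholders

disks = compute_v1.DisksClient()
instances = compute_v1.InstancesClient()

# Create the volume. Hyperdisk ML throughput is provisioned
# independently of capacity; both values here are illustrative.
disks.insert(
    project=PROJECT,
    zone=ZONE,
    disk_resource=compute_v1.Disk(
        name="model-weights",
        type_=f"zones/{ZONE}/diskTypes/hyperdisk-ml",
        size_gb=1024,
        provisioned_throughput=6000,  # MiB/s
    ),
).result()

# After hydrating the disk with model data from a single writer VM
# (and switching its access mode to READ_ONLY_MANY), attach it
# read-only to each GPU/TPU serving node.
for vm in ("gpu-node-0", "gpu-node-1"):
    instances.attach_disk(
        project=PROJECT,
        zone=ZONE,
        instance=vm,
        attached_disk_resource=compute_v1.AttachedDisk(
            source=f"projects/{PROJECT}/zones/{ZONE}/disks/model-weights",
            mode="READ_ONLY",
        ),
    ).result()
```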
“At Resemble AI, we leverage our proprietary deep-learning models to generate high-quality AI audio through text-to-speech and speech-to-speech synthesis. By combining Google Cloud’s A3 VMs with NVIDIA H100 GPUs and Hyperdisk ML, we’ve achieved significant improvements in our training workflows. Hyperdisk ML has drastically improved our data loader performance, enabling 2x faster epoch cycles compared to similar solutions. This acceleration has empowered our engineering team to experiment more freely, train at scale, and accelerate the path from prototype to production.” – Zohaib Ahmed, CEO, Resemble AI
“Abridge AI is revolutionizing clinical documentation by leveraging generative AI to summarize patient-clinician conversations in real time. By adopting Hyperdisk ML, we’ve accelerated model loading speeds by up to 76% and reduced pod initialization times.” – Taruj Goyal, Software Engineer, Abridge
High-capacity analytics workloads
For large-scale data analytics workloads like Hadoop and Kafka, which are less sensitive to disk latency fluctuations, Hyperdisk Throughput provides a cost-effective solution with high throughput. Its low cost per GiB and configurable throughput are ideal for processing large volumes of data with low TCO.
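As a rough sketch of how those knobs come together, the following provisions a Hyperdisk Throughput volume with the google-cloud-compute Python client; note that throughput is dialed in independently of capacity. The disk name, project, zone, and values are illustrative assumptions.

```python
from google.cloud import compute_v1

# Provision a large, throughput-oriented volume for an analytics
# worker (e.g., a Kafka broker). Values are illustrative only.
disk = compute_v1.Disk(
    name="kafka-broker-data",
    type_="zones/us-central1-a/diskTypes/hyperdisk-throughput",
    size_gb=16384,               # 16 TiB of capacity
    provisioned_throughput=600,  # MiB/s, sized to the workload's peak
)
compute_v1.DisksClient().insert(
    project="my-project", zone="us-central1-a", disk_resource=disk
).result()
```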
How to size and set up your Hyperdisk
To select and size the right Hyperdisk volume types for your workload, answer a few key questions:
- Storage management. Decide whether you want to manage the block storage for your workloads in a pool or individually. If your workload will have more than 10 TiB of capacity in a single project and zone, consider using Hyperdisk Storage Pools to lower your TCO and simplify planning. Note that Storage Pools do not affect disk performance, but some data protection features, such as Replication and High Availability, are not supported in Storage Pools.
- Latency. If your workload requires SSD-like latency (i.e., sub-millisecond), it is likely best served by Hyperdisk Balanced or Hyperdisk Extreme.
- IOPS or throughput. If your application requires less than 160K IOPS or 2.4 GiB/s of throughput from a single volume, Hyperdisk Balanced is a great fit. If it needs more than that, consider Hyperdisk Extreme.
- Sizing performance and capacity. Hyperdisk offers independently configurable capacity and performance, allowing you to pay for just the resources you need. You can use this capability to lower your TCO by understanding how much capacity your workload needs (i.e., how much data, in GiB or TiB, is stored on the disks that serve this workload) and the peak IOPS and throughput of those disks. If the workload is already running on Google Cloud, you can find many of these metrics in the console under “Metrics Explorer,” as in the sketch below.
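If you prefer to pull those peaks programmatically rather than through Metrics Explorer, here is a minimal sketch using the Cloud Monitoring Python client (google-cloud-monitoring). The project ID and lookback window are assumptions; write IOPS and the read/write byte-count metrics would each be inspected the same way.

```python
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())

# Peak read IOPS over the past 7 days, aligned to per-minute rates.
# Repeat with .../disk/write_ops_count and the byte-count metrics
# to size provisioned throughput as well.
results = client.list_time_series(
    request={
        "name": "projects/my-project",  # placeholder project ID
        "filter": (
            'metric.type = "compute.googleapis.com/instance/disk/read_ops_count"'
        ),
        "interval": monitoring_v3.TimeInterval(
            start_time={"seconds": now - 7 * 24 * 3600},
            end_time={"seconds": now},
        ),
        "aggregation": monitoring_v3.Aggregation(
            alignment_period={"seconds": 60},
            per_series_aligner=monitoring_v3.Aggregation.Aligner.ALIGN_RATE,
        ),
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
peak = max(
    (point.value.double_value for series in results for point in series.points),
    default=0.0,
)
print(f"Peak read IOPS observed: {peak:.0f}")
```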
Another important consideration is the level of business continuity and data protection required for your workloads. Different workloads have different Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements, each with different costs. Think about your workload tiers when making data-protection decisions: the more critical an application or workload, the lower the tolerance for data loss and downtime. Applications critical to business operations likely require zero RPO and RTO on the order of seconds. Hyperdisk's business continuity and data protection features help customers meet the performance, capacity, cost-efficiency, and resilience requirements they demand, and help them address their financial regulatory needs globally.
Here are a few questions to consider when selecting which Hyperdisk data protection capabilities to use for a workload:
- How do I protect my workloads from attacks and malicious insiders? Use Google Cloud Backup vault for cyber resilience: it provides backup immutability and indelibility, along with managed backup reporting and compliance. If you want to self-manage your own backups, Hyperdisk standard snapshots are an option for your workloads.
- How do I protect data from user errors and bad upgrades cost-efficiently, with low RPO/RTO? You can use point-in-time recovery with Instant Snapshots. Because creating a checkpoint is nearly instantaneous, this feature minimizes the risk of data loss from user error and bad upgrades with ultra-low RPO and RTO (see the sketch after this list).
- How do I easily deploy my critical workload (e.g., MySQL) with resilience across multiple locations? You can use Hyperdisk HA. This is a great fit for scenarios that require high availability and fast failover, such as SQL Server deployments that leverage failover clustering. For such workloads, you can also choose our new Hyperdisk Balanced High Availability with Multi-Writer support, which lets you run clustered compute with workload-optimized storage across two zones with RPO=0 synchronous replication.
- When a disaster occurs, how do I recover my workload elsewhere quickly and reliably, and run drills to confirm my recovery process? Use our disaster recovery capabilities with Hyperdisk Async Replication, which enables cross-region continuous replication and recovery from a regional failure, with fast validation support for disaster recovery drills via cloning. Further, consistency group policies help ensure that workload data distributed across multiple disks is recoverable when a workload needs to fail over between regions.
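As an example of the point-in-time recovery flow above, here is a minimal sketch that checkpoints a disk before a risky upgrade and restores from the checkpoint, using the google-cloud-compute Python client (assuming a release that includes the instant snapshots surface). All resource names are placeholders.

```python
from google.cloud import compute_v1

PROJECT, ZONE = "my-project", "us-central1-a"  # placeholders

# Take a near-instantaneous, in-place checkpoint before the upgrade.
compute_v1.InstantSnapshotsClient().insert(
    project=PROJECT,
    zone=ZONE,
    instant_snapshot_resource=compute_v1.InstantSnapshot(
        name="pre-upgrade-checkpoint",
        source_disk=f"zones/{ZONE}/disks/mysql-data",
    ),
).result()

# If the upgrade goes wrong, materialize a new disk from the
# checkpoint and swap it in for the damaged one.
compute_v1.DisksClient().insert(
    project=PROJECT,
    zone=ZONE,
    disk_resource=compute_v1.Disk(
        name="mysql-data-restored",
        type_=f"zones/{ZONE}/diskTypes/hyperdisk-balanced",
        source_instant_snapshot=(
            f"zones/{ZONE}/instantSnapshots/pre-upgrade-checkpoint"
        ),
    ),
).result()
```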
In short, Hyperdisk provides a wealth of options to help you match your block storage to the needs of your workloads. Further, selecting the right Hyperdisk variant and leveraging features such as Storage Pools can help you lower your TCO and simplify management. To learn more, please visit our website. For tailored recommendations, consult your Google Cloud account team.
1. As of March 2025, based on published information for Amazon EBS and Azure managed disks.
2. As of May 2025, compared to the maximum IOPS per volume of Amazon EBS gp3 volumes.
3. As of March 2025, at list price, for 50 to 150 TiB, peak IOPS of 25K to 75K, and 25% compressibility, compared to Amazon EBS gp3 volumes.
4. As of March 2025, based on internal Google benchmarking, compared to Rapid Storage, GCSFuse with Anywhere Cache, Parallelstore, and Lustre for larger node sizes.
5. As of March 2025, based on published performance for Microsoft Azure Ultra SSD and Amazon EBS io2 Block Express.
The authors would like to thank David Seidman and Ruwen Hess for their contributions to this post.
Source Credit: https://cloud.google.com/blog/products/storage-data-transfer/how-to-choose-the-right-hyperdisk-block-storage-for-your-use-case/