The world of artificial intelligence is moving at lightning speed. At Google Cloud, we’re committed to providing best-in-class infrastructure to power your AI and ML workloads. Dataflow is a critical component of Google Cloud’s AI stack, letting you build batch and streaming pipelines for a wide range of analytics and AI use cases. We’re excited to share a wave of recent features that give you more choice, greater availability, and improved efficiency when running your batch and streaming ML workloads.
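To make that concrete, here is a minimal sketch of a Beam batch pipeline built around the RunInference API, the usual entry point for model inference in Dataflow ML pipelines. The model path and the scikit-learn handler are illustrative assumptions, not part of this announcement:

```python
# Minimal sketch: a Beam pipeline that runs model inference with RunInference.
# The GCS model path is a placeholder; any supported model handler works here.
import numpy as np
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Assumes a pickled scikit-learn model was uploaded to this (placeholder) path.
model_handler = SklearnModelHandlerNumpy(model_uri="gs://my-bucket/model.pkl")

with beam.Pipeline() as pipeline:  # add Dataflow options to run on Google Cloud
    (pipeline
     | "CreateExamples" >> beam.Create([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
     | "RunInference" >> RunInference(model_handler)
     | "LogPredictions" >> beam.Map(print))
```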
More choice: Performance-optimized hardware
We understand that not all ML workloads are created equal. That’s why we’re expanding our hardware offerings to give you the flexibility to choose the best accelerators for your specific needs.
- New GPUs: We’re constantly adding the latest and greatest GPUs to our lineup, and we recently announced support for H100 and H100 Mega GPUs, so you can take advantage of cutting-edge hardware to accelerate your AI inference workloads. Leading businesses are leveraging GPUs in Dataflow to power innovative customer experiences, from threat intelligence platform provider Flashpoint powering document translation to media provider Spotify enabling at-scale podcast previews. (See the launch sketch after this list.)
- TPUs: For large-scale ML tasks, Tensor Processing Units (TPUs) offer a powerful and cost-effective option. We recently announced support for TPU v5e, v5p, and v6e, enabling state-of-the-art ML builders to efficiently run high-volume, low-latency machine learning inference workloads at scale, directly within their Dataflow jobs. (See the TPU sketch after this list.)
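Here is what the GPU path looks like in practice. The `worker_accelerator` service option is Dataflow’s documented mechanism for attaching GPUs to workers; the project, bucket, accelerator type, and machine type values below are placeholder assumptions to verify against your region’s availability:

```python
# Minimal sketch: launching a Beam job on Dataflow with H100 GPU workers.
# Project, region, bucket, and hardware names below are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                # placeholder project ID
    "--region=us-central1",                # placeholder region
    "--temp_location=gs://my-bucket/tmp",  # placeholder staging bucket
    "--machine_type=a3-highgpu-8g",        # A3 VMs host H100 GPUs
    # Attach GPUs to each worker and install the NVIDIA driver:
    "--dataflow_service_options="
    "worker_accelerator=type:nvidia-h100-80gb;count:8;install-nvidia-driver",
    # GPU jobs usually also need a custom container with CUDA libraries,
    # supplied via --sdk_container_image (omitted here).
])

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "Create" >> beam.Create(["a", "b", "c"])
     # A real job would put a GPU-backed RunInference step here.
     | "Process" >> beam.Map(str.upper))
```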
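TPUs ride the same mechanism. In the hedged sketch below, the accelerator type, topology, and machine type strings (`tpu-v5-lite-podslice`, `1x1`, `ct5lp-hightpu-1t`) mirror Compute Engine’s TPU v5e naming and are assumptions to check against the Dataflow TPU documentation:

```python
# Hedged sketch: a Dataflow job with single-host TPU v5e workers.
# The TPU type, topology, and machine type strings are assumptions that
# follow Compute Engine naming; verify them in the Dataflow TPU docs.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                # placeholder
    "--region=us-central1",                # placeholder
    "--temp_location=gs://my-bucket/tmp",  # placeholder
    "--machine_type=ct5lp-hightpu-1t",     # single-host TPU v5e VM (assumption)
    "--dataflow_service_options="
    "worker_accelerator=type:tpu-v5-lite-podslice;topology:1x1",
])

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | beam.Create([1, 2, 3])
     # A real job would run a JAX or TensorFlow RunInference step on the TPU.
     | beam.Map(lambda x: x * x))
```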
Greater accelerator availability
Getting access to the hardware you need, when you need it, is crucial for keeping your ML projects on track. We’ve introduced new ways to consume accelerators that make it easier than ever to secure those resources.
- GPU/TPU reservations: You can now reserve GPUs and TPUs for your Dataflow jobs, guaranteeing that capacity is there when you need it. This matters for critical workloads that can’t afford to wait for resources to become available. (See the reservation sketch after this list.)
- Flex-start GPU provisioning: For batch jobs with flexible start times, securing GPUs can be a manual and uncertain process due to high industry-wide demand. Our new flex-start provisioning model, enabled by Dynamic Workload Scheduler (DWS), addresses this: instead of a job failing when accelerator resources are unavailable, Dataflow now queues your job and automatically starts it as soon as the required GPUs become available. This eliminates repeated manual resubmissions, mitigating stockout risk and boosting developer productivity. (See the flex-start sketch after this list.)
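For reservations, the simplest path relies on standard Compute Engine behavior: an automatically consumed (open) reservation is picked up by any VM in the same project and zone whose machine type and accelerator shape match, Dataflow workers included. The sketch below assumes such a reservation already exists for `a3-highgpu-8g` VMs; specifically targeted reservations need the additional configuration described in the Dataflow documentation:

```python
# Sketch: a Dataflow job whose workers consume an existing open (automatically
# consumed) Compute Engine reservation. Open reservations match on project,
# zone, machine type, and accelerator shape, so the job only needs to request
# the reserved worker shape. All names are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                # project that holds the reservation
    "--region=us-central1",
    "--worker_zone=us-central1-a",         # zone of the reservation
    "--temp_location=gs://my-bucket/tmp",
    "--machine_type=a3-highgpu-8g",        # must match the reserved VM shape
    "--dataflow_service_options="
    "worker_accelerator=type:nvidia-h100-80gb;count:8;install-nvidia-driver",
])

with beam.Pipeline(options=options) as pipeline:
    pipeline | beam.Create(["reserved"]) | beam.Map(print)
```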
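And a hedged sketch of a flex-start submission. The service option name used below, `flex_start`, is a hypothetical placeholder (check the Dataflow documentation for the current flag); what it illustrates is the behavior described above, where the job queues instead of failing and starts once GPUs free up:

```python
# Hedged sketch of a flex-start batch job. "flex_start" is a HYPOTHETICAL
# option name for illustration only; see the Dataflow docs for the real flag.
# Conceptually: rather than failing on a GPU stockout, the job is queued by
# Dynamic Workload Scheduler and started when the GPUs become available.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                # placeholder
    "--region=us-central1",                # placeholder
    "--temp_location=gs://my-bucket/tmp",  # placeholder
    "--machine_type=a3-highgpu-8g",
    "--dataflow_service_options="
    "worker_accelerator=type:nvidia-h100-80gb;count:8;install-nvidia-driver",
    "--dataflow_service_options=flex_start",  # hypothetical flag name
])

with beam.Pipeline(options=options) as pipeline:
    pipeline | beam.Create(range(3)) | beam.Map(lambda x: x + 1)
```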
Source Credit: https://cloud.google.com/blog/products/data-analytics/new-dataflow-features-to-enable-streaming-and-ml-workloads/
