The AI Hypercomputer advantage: Hardware and software co-designed for faster, more efficient outcomes
On top of this hardware sits a co-designed software layer. Our goal is to make the most of Ironwood’s massive processing power and memory, and to make them easy to use throughout the AI lifecycle.
- To improve fleet efficiency and operations, we’re excited to announce that TPU customers can now benefit from Cluster Director capabilities in Google Kubernetes Engine (GKE). These include advanced maintenance and topology awareness for intelligent scheduling and highly resilient clusters.
- For pre-training and post-training, we’re also sharing new enhancements to MaxText, a high-performance, open source LLM framework, to make it easier to implement the latest training and reinforcement learning optimization techniques, such as Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO).
- For inference, we recently announced enhanced support for TPUs in vLLM, which lets developers switch between GPUs and TPUs, or run both, with only a few minor configuration changes. We also announced GKE Inference Gateway, which intelligently load balances across TPU servers to reduce time-to-first-token (TTFT) latency by up to 96% and serving costs by up to 30%.
Our software layer is what enables AI Hypercomputer’s high performance and reliability for training, tuning, and serving demanding AI workloads at scale. Thanks to deep integrations across the stack, from data-center-wide hardware optimizations to open software and managed services, Ironwood TPUs are our most powerful and energy-efficient TPUs to date. Learn more about our approach to hardware and software co-design here.
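For readers unfamiliar with GRPO, mentioned in the list above, the core idea is that each sampled response to a prompt is scored relative to the other responses in the same group, rather than against a separately learned value model. The sketch below illustrates only that normalization step in plain Python. It is not MaxText code: the function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for illustration, and real implementations may differ in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled responses.

    Each advantage is (reward - group mean) / (group std + eps).
    Assumption: population standard deviation; eps avoids division
    by zero when every response in the group gets the same reward.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four responses to the same prompt.
    rewards = [1.0, 0.0, 0.0, 1.0]
    for reward, advantage in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Responses that beat their group's average get a positive advantage and the rest get a negative one, so the policy update favors the better samples. When all rewards in a group are equal, every advantage is zero and that group contributes no learning signal.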
Axion: Redefining general-purpose compute
Building and serving modern applications requires both highly specialized accelerators and powerful, efficient general-purpose compute. This was our vision for Axion, our custom Arm Neoverse®-based CPUs, which we designed to deliver compelling performance, cost and energy efficiency for everyday workloads.
Today, we are expanding our Axion portfolio with:
- N4A (preview), our second general-purpose Axion VM, which is ideal for microservices, containerized applications, open-source databases, batch, data analytics, development environments, experimentation, data preparation and web serving jobs that make AI applications possible. Learn more about N4A here.
- C4A metal (in preview soon), our first Arm-based bare-metal instance, which provides dedicated physical servers for specialized workloads such as Android development, automotive in-car systems, software with strict licensing requirements, scale test farms, or complex simulations. Learn more about C4A metal here.
Source Credit: https://cloud.google.com/blog/products/compute/ironwood-tpus-and-new-axion-based-vms-for-your-ai-workloads/
