Skipped Google Cloud Next in Las Vegas this year? These are the announcements I think will keep landing in customer conversations over the next quarter.

Gemini Enterprise Agent Platform
The headline of the keynote, and probably rightly so. The Gemini Enterprise Agent Platform is a full stack to build, run, and govern agents end to end: Agent Studio (low-code, natural language), Agent Designer (no-code, trigger-based, inside the Gemini Enterprise app), Agent-to-Agent Orchestration, and the governance plane around it (Registry, Identity, Gateway, Observability). There’s also an Agent Inbox so a human can actually see what their agents are doing, and long-running agents that operate autonomously inside secure cloud sandboxes.
If you’ve used Vertex AI Agent Builder, this is what it grows into when you add real enterprise governance. The Agent Gateway partner list (Cisco, Palo Alto, CrowdStrike, Okta, Ping, Zscaler, F5, and more) is the part that gets it past most CISOs.
TPU 8t and TPU 8i
Google split the eighth generation into two: a training chip (8t) and an inference chip (8i). The 8i claims 80% better performance per dollar on inference. That is the FinOps number to model in your TCO sheet, especially if you’re running a serving fleet.
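One way to sanity-check that claim in a TCO sheet. The hourly rate and throughput below are placeholders, not TPU pricing; the only thing the sketch pins down is what "80% better performance per dollar" does to cost per token:

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Serving cost per 1M output tokens for one accelerator host."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000

# Hypothetical fleet numbers: $40/hr per host sustaining 5,000 tok/s.
baseline = cost_per_million_tokens(40.0, 5_000)  # ~$2.22 per 1M tokens

# "80% better performance per dollar" means 1.8x the tokens per dollar,
# i.e. cost per token divided by 1.8 -- roughly a 44% reduction.
improved = baseline / 1.8  # ~$1.23 per 1M tokens
```

Swap in your own contract rates and measured throughput; the divide-by-1.8 is the part people get wrong (it is not an 80% cost cut).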
A5X instances also bring NVIDIA Vera Rubin NVL72 to GCP later this year. So both bets are covered: in-house silicon and the latest NVIDIA.
Virgo Network
The new megascale data center fabric. High-radix switches, big bisection bandwidth, deterministic low latency. The point is keeping accelerators fed for distributed training and serving. Pair it with Managed Lustre at 10TB/s and Cloud Storage Rapid and the “I/O is bottlenecking my training” excuse becomes harder to defend.
GKE: this is where it got interesting for me
This was the part of the event that surprised me. The headline page on blog.google barely mentions GKE, but the actual breakouts were dense with announcements. Some context Google shared: 66% of organisations now run Kubernetes for generative AI workloads, multi-agent workflows are up 327% in a few months, and GKE powers AI for all of Google’s top 50 customers on the platform.
GKE Hypercluster (private GA). One conformant control plane managing up to 1 million chips across 256,000 nodes, spanning regions. The interesting bit is Titanium Intelligence Enclave, a hardware-rooted “no-admin-access” security engine where weights and prompts stay sealed even from platform admins. Clearly aimed at frontier builders, but the design choices are worth understanding even if you’re nowhere near that scale.
GKE Agent Sandbox. gVisor-isolated sandboxes for agents to run untrusted code and tool calls. Up to 300 sandboxes per second per cluster, sub-second time to first instruction. On Axion N4A it claims 30% better price-performance than the next leading hyperscaler. The only native agent sandbox among hyperscalers at the moment. I wrote about the N4A price-performance here.
Inference Gateway. Capacity-aware routing and KV Cache management to cut TTFT (Time to First Token). Less hand-tuning to land on a decent inference baseline.
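The idea behind capacity-aware routing is easy to sketch. To be clear, this is not the Inference Gateway API; it is a toy scorer with invented replica metrics, showing why KV-cache pressure and queue depth beat round-robin for TTFT:

```python
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    kv_cache_used: float  # fraction of KV cache in use, 0.0..1.0
    queue_depth: int      # requests waiting ahead of this one

def pick_replica(replicas: list[Replica], max_kv: float = 0.9) -> Replica:
    # Skip replicas whose KV cache is nearly full: admitting a request
    # there forces evictions and inflates time to first token.
    eligible = [r for r in replicas if r.kv_cache_used < max_kv] or list(replicas)
    # Among eligible replicas, prefer the shortest queue, then the
    # coolest cache, so prefill can start as soon as possible.
    return min(eligible, key=lambda r: (r.queue_depth, r.kv_cache_used))
```

Round-robin would happily send a long prompt to the replica at 95% cache utilisation; a capacity-aware policy never does, which is where the TTFT win comes from.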
RL primitives. RL Scheduler (kills the straggler effect between sampling, reward, and training steps), RL Sandbox (kernel-level isolation, millisecond provisioning), RL Observability and Reliability dashboards. Anyone who has tried to run reinforcement learning on Kubernetes knows how miserable that path was. This looks like Google taking it seriously.
Intent-based autoscaling on custom metrics. Scale on signals beyond CPU and memory, such as queue depth or in-flight requests. Honestly, this should have been there years ago.
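For reference, the familiar upstream shape of this is an autoscaling/v2 HorizontalPodAutoscaler on a Pods metric. The metric name below is hypothetical and assumes a custom-metrics adapter is already exporting it:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference-server
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: inference_queue_depth  # hypothetical custom metric
        target:
          type: AverageValue
          averageValue: "10"           # scale out above 10 queued requests per pod
```

Queue depth tracks inference load far better than CPU does, since GPU-bound pods can sit near-idle on CPU while requests pile up.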
Ambient networking. Integrated data plane for GKE and Cloud Run. Service discovery, zero-trust, traffic management, no sidecars. If your inference latency or platform bill has been hurt by Istio sidecars, this is the one.
GKE continues to be positioned as the platform for AI workloads. Whether you buy “Kubernetes as the OS of the AI era” as a marketing line or not, the technical substance is doing most of the work.
Agentic Data Cloud
Knowledge Catalog (formerly Dataplex Universal Catalog, renamed in April). Gemini autonomously tags and links entities across your enterprise data so agents pick up your business context and naming conventions. Agent quality is mostly a context problem, and this is one of the better attacks on it I’ve seen.
Cross-Cloud Lakehouse (BigLake is now Google Cloud Lakehouse). Iceberg-standardised, Cross-Cloud Interconnect into the data plane, bi-directional federation in preview. Query data sitting in AWS without moving it. Spanner Omni and Lakehouse federation for AlloyDB are also in preview.
Worth flagging: Lightning Engine for Apache Spark at claimed 2x price-performance, BigQuery fluid scaling at up to 34% lower cost for autoscaling workloads, and an in-memory tier for Bigtable.
Agentic Defense (Google & Wiz)
Now that the Wiz acquisition has closed, the integration ships. Threat Hunting Agent (proactive hunting, writes detection rules autonomously), Detection Engineering Agent, Dark Web Intelligence (98% accuracy claim on millions of daily events), Third-Party Context Agent. Wiz coverage now extends to Databricks and more AI studios.
The framing Kurian used in the press session was AI red, blue, and green teams (attack, defend, fix) all running as agents. That mental model lands better than the product-by-product list.
Workspace Intelligence
Ask Gemini in Chat now synthesises across Docs, Drive, Meet, and Gmail and takes actions in place. Schedule a meeting, draft a brief, all from the chat surface. For end-user adoption inside enterprises, this is probably the most consequential announcement of the week.
What I’d actually do with this on Monday
If you’ve been waiting for a reason to consolidate fragmented GKE clusters, Hypercluster gives you one. Even if you don’t sign up for private GA, the design is the new reference.
If you’re running agent workloads outside GKE Agent Sandbox, run the price-performance numbers. The gVisor-on-Axion combination is hard to ignore at 30% better than the next hyperscaler.
If you’re doing RL training on GKE today, the scheduler and sandbox primitives change the cost picture meaningfully.
If you have BigQuery autoscaling workloads, audit them this quarter. Up to 34% is not bad at all.
The one I’m most cautious about: a single GKE control plane managing a million chips across regions sounds wonderful until you think through blast radius and change management. Private GA is the right place for it. Let’s validate before we bet a pipeline on it.
Bonus: DESIGN.md
Not strictly a Next ’26 announcement, but Google Labs published this in the same window and it deserves a mention if you build anything agentic that touches a UI — who doesn’t these days?
DESIGN.md is a format spec for describing a visual identity to coding agents. YAML front matter for the machine-readable tokens (colors, typography, spacing, components), markdown prose underneath for the rationale. The agent gets exact values plus the why, all in a single file your designers and developers already know how to read and review.
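A minimal illustration of that shape. The token names and values here are invented for the example, not taken from the actual spec:

```markdown
---
colors:
  primary: "#1A73E8"
  surface: "#FFFFFF"
typography:
  body:
    family: "Inter"
    size: 16px
spacing:
  unit: 8px
---

## Rationale

Primary blue is reserved for interactive elements only; never use it
for decorative accents. Body text stays at 16px because our dashboards
are dense and anything smaller fails readability reviews.
```

The front matter gives the agent exact values to emit; the prose stops it from "creatively" applying them where they don't belong.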
It ships with a CLI for linting (broken token references, WCAG contrast checks, section ordering), diffing two versions to catch regressions, and exporting to Tailwind theme config or W3C DTCG tokens. Currently in alpha, Apache 2.0, already sitting at 8k+ stars.
If you’ve watched coding agents flail at “match our brand” instructions, this is your missing context layer.
Found this useful? Connect with me on LinkedIn for more insights on architecture, AI, and practical tech strategy. I regularly share ideas, lessons learned, and real-world solutions built from the front lines.
Forget the Agent Platform. The real story at Next ’26 was GKE was originally published in Google Cloud – Community on Medium, where people are continuing the conversation by highlighting and responding to this story.
