

What if I told you that you can interact with Kubernetes using natural language, with no need to learn YAML or struggle with kubectl commands?
That’s where our KubeTalk Agent comes into play. Powered by Google Cloud, KubeTalk bridges the gap between human intent and technical execution — making Kubernetes accessible, faster, and a lot less painful.
Today, we’re getting hands-on. We’ll build a real-world AI agent from scratch and deploy it to Vertex AI Agent Engines, bringing the concept to life through a powerful Kubernetes use case.
KubeTalk is an innovative AI agent designed to revolutionize Kubernetes operations, specifically on Google Kubernetes Engine (GKE), by way of a custom MCP server.
It allows everyone from expert DevOps engineers to platform beginners to manage Kubernetes clusters using natural language commands like:
- “Deploy an Nginx web server”
- “Scale my ‘backend’ service to 5 instances”
KubeTalk takes your plain English instructions and translates them into valid Kubernetes configurations and kubectl commands: no YAML, no CLI struggle. In the process, it:
- Accelerates development cycles
- Reduces human error
- Democratizes Kubernetes usage
- Acts as a natural language interface for cloud-native tasks
KubeTalk gives you a conversational interface to a more agile, DevOps-friendly future.
Powered by a Custom MCP Server
To enable real infrastructure control, our agent integrates with a custom-built GKE MCP Server, which streams Kubernetes cluster events and responses over SSE (Server-Sent Events).
In the next blog, I’ll walk you through how I built this lightweight MCP server from scratch using FastAPI + Kubernetes client libraries, and how it acts as a bridge between LLMs and your infra.
➡️ Read the full MCP server implementation here: https://github.com/bathas2021/gke-mcp.git
Let’s get right into it!
Prerequisites
Before we begin, make sure you have:
- ✅ A Google Cloud Platform (GCP) project
- ✅ Python 3.8+ installed
- ✅ The Vertex AI and Cloud Run APIs enabled in your GCP project
- ✅ (Optional) Basic understanding of Kubernetes (GKE)
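If those APIs aren’t enabled yet, you can turn them on from the command line (this assumes you have the gcloud CLI installed and authenticated against your project):
gcloud services enable aiplatform.googleapis.com run.googleapis.com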
Building an Agent
To bring our AI agent to life, we use Google’s adk (Agent Development Kit), which simplifies building intelligent, tool-augmented agents. Below is the implementation of our kubetalk agent: a DevOps-flavored, LLM-powered assistant that can interact with a Kubernetes Model Context Protocol (MCP) server in real time.
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseConnectionParams
We start by importing the core LlmAgent class and the MCPToolset, which equips our agent with the ability to send and receive instructions from the Kubernetes backend via Server-Sent Events (SSE).
def create_root_agent():
    return LlmAgent(
        model='gemini-2.0-flash',
        name='kubetalk',
        instruction="""
        You are an AI DevOps infrastructure agent...
        """,
Here, we’re instantiating an LLM agent named kubetalk using the lightweight and fast gemini-2.0-flash model. This model ensures quick responses, ideal for real-time infrastructure tasks.
The instruction block defines the agent’s persona and context, guiding its behavior as an expert DevOps assistant. This is where you’d describe what the agent knows, how it should respond, and its domain boundaries.
        tools=[
            MCPToolset(
                connection_params=SseConnectionParams(
                    url='http://X.X.X.X:X/sse',
                    headers={'Accept': 'text/event-stream'}
                )
            )
        ]
    )
To allow our agent to interact with Kubernetes operations, we attach a tool: the MCPToolset. This enables the agent to connect to the MCP server using SSE, allowing it to receive streaming updates (like pod status, deployment changes, etc.).
Alternative transport: for simpler use cases or local testing, the ADK also supports stdio (standard input/output) transport, which can be useful when integrating the agent with shell scripts or CLI tools.
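As an illustration, a stdio-based toolset might look like the sketch below. This assumes your ADK version exposes StdioConnectionParams (import paths vary between releases, so check your installed version), and the server command and script name are hypothetical stand-ins for the MCP server from the linked repo:
from mcp import StdioServerParameters
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams

# Hypothetical: launch the MCP server as a local subprocess and exchange
# messages over stdin/stdout instead of SSE.
local_toolset = MCPToolset(
    connection_params=StdioConnectionParams(
        server_params=StdioServerParameters(
            command='python',
            args=['gke_mcp_server.py'],  # hypothetical entry point
        )
    )
)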
root_agent = create_root_agent()
Finally, we instantiate the root agent. This single line brings together the model, instructions, and tools into a fully operational AI DevOps assistant, ready to assist with infrastructure tasks via natural language.
Full Source Code
You can find the complete implementation, including the MCP server and deployment scripts, in the GitHub repository:
👉 https://github.com/bathas2021/adk-agents
Running the Agent Locally with adk web
Before deploying to the cloud, it’s helpful to test the agent locally to ensure everything works as expected. The Agent Development Kit (ADK) provides a built-in CLI command called adk web, which launches a lightweight web-based chat interface to interact with your agent locally.
Start the local agent server using:
adk web
Make sure you run it from the parent directory that contains the kubetalk folder, not from inside the folder itself.
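For reference, a typical layout looks like this (following the ADK convention of an agent package whose __init__.py imports the agent module):
your-workspace/          <- run `adk web` from here
├── kubetalk/
│   ├── __init__.py      <- contains: from . import agent
│   └── agent.py         <- defines root_agent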
What’s Happening Behind the Scenes?
When you run adk web, it:
- Initializes your agent locally using the Gemini model
- Sets up a web UI with a text input + streaming output
- Allows you to test agent-tool interaction (like MCP calls) live
Once it starts, open the URL printed in your terminal (http://localhost:8000 by default) and select kubetalk from the agent dropdown to start chatting.
Once our agent logic is implemented and working locally, the next step is to deploy it to the cloud so it can serve real users or automation pipelines. We’ll use Vertex AI Agent Engines, which is Google Cloud’s managed platform for hosting and scaling LLM-based agents.
Set Up Project and Location
Before interacting with Vertex services, we need to configure our GCP environment:
import vertexai

GOOGLE_CLOUD_PROJECT = "XXXXXX"
GOOGLE_CLOUD_LOCATION = "us-central1"
STAGING_BUCKET = "gs://XXXXXX"

vertexai.init(
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
    staging_bucket=STAGING_BUCKET
)
Wrap Your Agent with AdkApp
from vertexai.preview.reasoning_engines import AdkApp
from kubetalk.agent import root_agent
app = AdkApp(agent=root_agent, enable_tracing=False)
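AdkApp wraps our ADK agent in the interface that Agent Engine expects. Before deploying, you can smoke-test the wrapped app locally; here’s a minimal sketch (the user ID and prompt are arbitrary placeholders):
# Quick local check: open a session and stream a test query through the app.
session = app.create_session(user_id="demo-user")
for event in app.stream_query(
    user_id="demo-user",
    session_id=session.id,
    message="List the pods in the default namespace",
):
    print(event)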
Deploy the Agent to Vertex AI
Now we deploy the agent using agent_engines.create():
from vertexai import agent_engines

remote_app = agent_engines.create(
    agent_engine=app,
    requirements=[
        "google-cloud-aiplatform[adk,agent_engines]",
    ],
    extra_packages=["./kubetalk"],
    # Optional: display_name="kubetalk-agent"
)
The deployment takes about 15–20 minutes — perfect time to grab a cup of coffee.
Behind the scenes, this call:
- Packages your agent code and dependencies.
- Uploads everything to your staging GCS bucket.
- Creates a hosted agent endpoint on Vertex AI.
- Returns a remote_app object you can now call.
Once deployed, your agent is:
- Hosted in a managed, scalable environment.
- Ready to receive user input via APIs or UI widgets.
- Able to communicate with your custom MCP backend in real time.
You can now integrate the agent into UIs, chatbots, CLI tools, or even trigger it via GitHub Actions.
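For example, querying the deployed agent from Python might look like the sketch below; the method names follow the Agent Engine ADK template, and the user ID and prompt are placeholders:
from vertexai import agent_engines

# If you're reconnecting from a new process, fetch the engine by resource name:
# remote_app = agent_engines.get(remote_app.resource_name)

# Note: the remote API returns the session as a dict.
session = remote_app.create_session(user_id="demo-user")
for event in remote_app.stream_query(
    user_id="demo-user",
    session_id=session["id"],
    message="Scale my 'backend' service to 5 instances",
):
    print(event)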
Once the agent is successfully deployed to Vertex AI Agent Engines, Google provides two RESTful endpoints to interact with it:
1. Synchronous Query Endpoint: https://us-central1-aiplatform.googleapis.com/v1/projects/XXXXX/locations/us-central1/reasoningEngines/XXXXXX:query
- Purpose: Traditional request/response interactions
2. Streaming (SSE) Endpoint: https://us-central1-aiplatform.googleapis.com/v1/projects/XXXX/locations/us-central1/reasoningEngines/XXXXXX:streamQuery?alt=sse
- Purpose: Stream partial responses as the model thinks
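As a rough sketch, you could call the streaming endpoint directly from Python. The request body shape here assumes the ADK template’s registered stream_query method (double-check against the Agent Engine REST docs for your version), and the XXXXX placeholders mirror the URLs above:
import google.auth
import google.auth.transport.requests
import requests

# Obtain an access token from Application Default Credentials.
creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(google.auth.transport.requests.Request())

url = ("https://us-central1-aiplatform.googleapis.com/v1/projects/XXXXX"
       "/locations/us-central1/reasoningEngines/XXXXXX:streamQuery?alt=sse")

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {creds.token}"},
    json={"class_method": "stream_query",
          "input": {"user_id": "demo-user", "message": "List my pods"}},
    stream=True,
)
for line in resp.iter_lines():
    if line:
        print(line.decode())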
Stay tuned. The agent era is just beginning.
Happy exploring!
Source Credit: https://medium.com/google-cloud/talk-to-kubernetes-build-an-ai-agent-that-understands-natural-language-powered-by-google-cloud-a03f061e3e7a