

What if I told you that you can interact with Kubernetes using natural language, with no need to learn YAML or struggle with kubectl commands?
That’s where our KubeTalk Agent comes into play. Powered by Google Cloud, KubeTalk bridges the gap between human intent and technical execution — making Kubernetes accessible, faster, and a lot less painful.
Today, we’re getting hands-on. We’ll build a real-world AI agent from scratch and deploy it to Vertex AI Agent Engines, bringing the concept to life through a powerful Kubernetes use case.
KubeTalk is an innovative AI agent designed to revolutionize Kubernetes operations, specifically on Google Kubernetes Engine (GKE), by way of a custom MCP server.
It allows everyone from expert DevOps engineers to platform beginners to manage Kubernetes clusters using natural language commands like:
- “Deploy an Nginx web server”
- “Scale my ‘backend’ service to 5 instances”
KubeTalk takes your plain English instructions and translates them into valid Kubernetes configurations and kubectl commands: no YAML, no CLI struggle. In the process, it:
- Accelerates development cycles
- Reduces human error
- Democratizes Kubernetes usage
- Acts as a natural language interface for cloud-native tasks
KubeTalk gives you a conversational interface to a more agile, DevOps-friendly future.
Powered by a Custom MCP Server
To enable real infrastructure control, our agent integrates with a custom-built GKE MCP Server, which streams Kubernetes cluster events and responses over SSE (Server-Sent Events).
In the next blog, I’ll walk you through how I built this lightweight MCP server from scratch using FastAPI + Kubernetes client libraries, and how it acts as a bridge between LLMs and your infra.
➡️ Read the full MCP server implementation here: https://github.com/bathas2021/gke-mcp.git
Let’s get right into it!
Prerequisites
Before we begin, make sure you have:
- ✅ A Google Cloud Platform (GCP) project
- ✅ Python 3.8+ installed
- ✅ The Vertex AI and Cloud Run APIs enabled in your GCP project
- ✅ (Optional) Basic understanding of Kubernetes (GKE)
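If those APIs aren’t enabled yet, you can turn them on from the command line (this assumes you have the gcloud CLI installed and authenticated against your project):
gcloud services enable aiplatform.googleapis.com run.googleapis.com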
Building an Agent
To bring our AI agent to life, we use Google’s adk (Agent Development Kit), which simplifies building intelligent, tool-augmented agents. Below is the implementation of our kubetalk agent: a DevOps-flavored, LLM-powered assistant that can interact with a Kubernetes Model Context Protocol (MCP) server in real time.
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseConnectionParams
We start by importing the core LlmAgent class and the MCPToolset, which equips our agent with the ability to send and receive instructions from the Kubernetes backend via Server-Sent Events (SSE).
def create_root_agent():
    return LlmAgent(
        model='gemini-2.0-flash',
        name='kubetalk',
        instruction="""
        You are an AI DevOps infrastructure agent...
        """,
Here, we’re instantiating an LLM agent named kubetalk using the lightweight and fast gemini-2.0-flash model. This model ensures quick responses, ideal for real-time infrastructure tasks.
The instruction block defines the agent’s persona and context, guiding its behavior as an expert DevOps assistant. This is where you’d describe what the agent knows, how it should respond, and its domain boundaries.
        tools=[
            MCPToolset(
                connection_params=SseConnectionParams(
                    url='http://X.X.X.X:X/sse',
                    headers={'Accept': 'text/event-stream'}
                )
            )
        ]
    )
To allow our agent to interact with Kubernetes operations, we attach a tool: the MCPToolset. This enables the agent to connect to the MCP server using SSE, allowing it to receive streaming updates (like pod status, deployment changes, etc.).
Alternative transport: for simpler use cases or local testing, the ADK also supports stdio (standard input/output) transport, which can be useful when integrating the agent with shell scripts or CLI tools.
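As an illustration, a stdio-based toolset might look like the sketch below. This assumes your ADK version exposes StdioConnectionParams (import paths vary between releases, so check your installed version), and the server command and script name are hypothetical stand-ins for the MCP server from the linked repo:
from mcp import StdioServerParameters
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams

# Hypothetical: launch the MCP server as a local subprocess and exchange
# messages over stdin/stdout instead of SSE.
local_toolset = MCPToolset(
    connection_params=StdioConnectionParams(
        server_params=StdioServerParameters(
            command='python',
            args=['gke_mcp_server.py'],  # hypothetical entry point
        )
    )
)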
root_agent = create_root_agent()
Finally, we instantiate the root agent. This single line brings together the model, instructions, and tools into a fully operational AI DevOps assistant, ready to assist with infrastructure tasks via natural language.
Full Source Code
You can find the complete implementation, including the MCP server and deployment scripts, in the GitHub repository:
👉 https://github.com/bathas2021/adk-agents
Running the Agent Locally with adk web
Before deploying to the cloud, it’s helpful to test the agent locally to ensure everything works as expected. The Agent Development Kit (ADK) provides a built-in CLI command called adk web, which launches a lightweight web-based chat interface to interact with your agent locally.
Start the local agent server using:
adk web
Make sure you run it from the parent directory that contains the kubetalk folder, not from inside the folder itself.
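For reference, a typical layout looks like this (following the ADK convention of an agent package whose __init__.py imports the agent module):
your-workspace/          <- run `adk web` from here
├── kubetalk/
│   ├── __init__.py      <- contains: from . import agent
│   └── agent.py         <- defines root_agent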
What’s Happening Behind the Scenes?
When you run adk web, it:
- Initializes your agent locally using the Gemini model
- Sets up a web UI with a text input + streaming output
- Allows you to test agent-tool interaction (like MCP calls) live
Once it starts, open the URL printed in your terminal (http://localhost:8000 by default) and select kubetalk from the agent dropdown to start chatting.
Once our agent logic is implemented and working locally, the next step is to deploy it to the cloud so it can serve real users or automation pipelines. We’ll use Vertex AI Agent Engines, which is Google Cloud’s managed platform for hosting and scaling LLM-based agents.
Set Up Project and Location
Before interacting with Vertex services, we need to configure our GCP environment:
import vertexai

GOOGLE_CLOUD_PROJECT = "XXXXXX"
GOOGLE_CLOUD_LOCATION = "us-central1"
STAGING_BUCKET = "gs://XXXXXX"

vertexai.init(
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
    staging_bucket=STAGING_BUCKET
)
Wrap Your Agent with AdkApp
from vertexai.preview.reasoning_engines import AdkApp
from kubetalk.agent import root_agent
app = AdkApp(agent=root_agent, enable_tracing=False)
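AdkApp wraps our ADK agent in the interface that Agent Engine expects. Before deploying, you can smoke-test the wrapped app locally; here’s a minimal sketch (the user ID and prompt are arbitrary placeholders):
# Quick local check: open a session and stream a test query through the app.
session = app.create_session(user_id="demo-user")
for event in app.stream_query(
    user_id="demo-user",
    session_id=session.id,
    message="List the pods in the default namespace",
):
    print(event)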
Deploy the Agent to Vertex AI
Now we deploy the agent using agent_engines.create():
from vertexai import agent_engines

remote_app = agent_engines.create(
    agent_engine=app,
    requirements=[
        "google-cloud-aiplatform[adk,agent_engines]",
    ],
    extra_packages=["./kubetalk"],
    # Optional: display_name="kubetalk-agent"
)
The deployment takes about 15–20 minutes — perfect time to grab a cup of coffee.
Behind the scenes, this call:
- Packages your agent code and dependencies.
- Uploads everything to your staging GCS bucket.
- Creates a hosted agent endpoint on Vertex AI.
- Returns a remote_app object you can now call.
Once deployed, your agent is:
- Hosted in a managed, scalable environment.
- Ready to receive user input via APIs or UI widgets.
- Able to communicate with your custom MCP backend in real time.
You can now integrate the agent into UIs, chatbots, CLI tools, or even trigger it via GitHub Actions.
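For example, querying the deployed agent from Python might look like the sketch below; the method names follow the Agent Engine ADK template, and the user ID and prompt are placeholders:
from vertexai import agent_engines

# If you're reconnecting from a new process, fetch the engine by resource name:
# remote_app = agent_engines.get(remote_app.resource_name)

# Note: the remote API returns the session as a dict.
session = remote_app.create_session(user_id="demo-user")
for event in remote_app.stream_query(
    user_id="demo-user",
    session_id=session["id"],
    message="Scale my 'backend' service to 5 instances",
):
    print(event)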
Once the agent is successfully deployed to Vertex AI Agent Engines, Google provides two RESTful endpoints to interact with it:
1. Synchronous Query Endpoint: https://us-central1-aiplatform.googleapis.com/v1/projects/XXXXX/locations/us-central1/reasoningEngines/XXXXXX:query
- Purpose: Traditional request/response interactions
2. Streaming (SSE) Endpoint: https://us-central1-aiplatform.googleapis.com/v1/projects/XXXX/locations/us-central1/reasoningEngines/XXXXXX:streamQuery?alt=sse
- Purpose: Stream partial responses as the model thinks
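As a rough sketch, you could call the streaming endpoint directly from Python. The request body shape here assumes the ADK template’s registered stream_query method (double-check against the Agent Engine REST docs for your version), and the XXXXX placeholders mirror the URLs above:
import google.auth
import google.auth.transport.requests
import requests

# Obtain an access token from Application Default Credentials.
creds, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(google.auth.transport.requests.Request())

url = ("https://us-central1-aiplatform.googleapis.com/v1/projects/XXXXX"
       "/locations/us-central1/reasoningEngines/XXXXXX:streamQuery?alt=sse")

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {creds.token}"},
    json={"class_method": "stream_query",
          "input": {"user_id": "demo-user", "message": "List my pods"}},
    stream=True,
)
for line in resp.iter_lines():
    if line:
        print(line.decode())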
Stay tuned. The agent era is just beginning.
Happy exploring!
Source Credit: https://medium.com/google-cloud/talk-to-kubernetes-build-an-ai-agent-that-understands-natural-language-powered-by-google-cloud-a03f061e3e7a