Finding Earth Engine Data Doesn’t Have to Be a Struggle: Let AI Agents Do the Heavy Lifting | by Eric Abelson | Google Cloud - Community

By Eric Abelson & Kristopher Overholt
(co-written with Kristopher Overholt, Developer Relations Engineer, Google)

Agentic AI personified as a humanoid working on a geospatial project on a computer. Copyright 2025 by Eric Abelson & OpenAI

Google Earth Engine (GEE) is a powerhouse for working with remote sensing data, but if you’re not already familiar with its ecosystem, even a simple task such as finding the right dataset can be overwhelming. What if, instead of scanning through collection pages and deciphering cryptic dataset descriptions, you could just describe what you’re trying to study and have AI agents guide you to the right data?

But what actually is an AI agent in this scenario? Is it a language model responding to simple instructions, or is it made up of a more complex system? Can it reason through challenging geoscience research tasks, use tools to talk to external data sources, or explore and learn insights from the data the way a researcher might?

The next few sections walk through how we built an agentic system like this with ADK: how the agents are defined, how the sub-agents coordinate and work together, and how such agents can help researchers find the right dataset (the needle) within the mountain of map layers in Earth Engine (the haystack).

Navigating the Earth Engine catalog requires domain expertise in the geosciences and the ability to extract meaning from metadata. It’s not only confusing for developers who are new to AI agents and generative AI; even long-time experts can spend a lot of time on the early stages of dataset exploration.

As a researcher, I often knew where to look for clues to find my next golden dataset, but I always ran into friction: I would find amazing spatial datasets at resolutions that I couldn’t use, I encountered naming schemes that felt like secret codes, and overall, I had the sense that I needed deeper specialization in satellite metadata just to get started exploring it.

Here, we designed a multi-agent system using the free and open-source Agent Development Kit (ADK). This allowed us to describe the research goals in plain language via prompts and system instructions and get back a comprehensive response that includes a curated shortlist of data layers, plus a summary of their coverage, resolution, and usage considerations.

We designed our agentic system around three core research workflows (and we even left some ideas for future versions as well!): 1) interpreting the user’s research needs, 2) finding relevant datasets, and 3) summarizing those datasets in a way that highlights their utility and tradeoffs. We used the ADK to do the heavy lifting, including agent orchestration, tool use, and session management. And we started with the multi-agent team example in the ADK documentation and further customized the agent’s reasoning instructions and tools for our research tasks.

To power our agents, we gave them access to a few key tools: small, purpose-built Python functions for querying the Earth Engine catalog, filtering results, and pulling metadata fields. These tools mimic the steps a user would normally take in the GEE Code Editor, client libraries, or catalog, but they are packaged in a structured and modular way so that the agent can make use of them while we talk with it and explore the data together.

search_gee_catalog(query)

This tool constructs a search URL to the Earth Engine dataset catalog, based on a user’s input query, and returns the search results to the agent. It allows the agent to simulate what a human might do when browsing the official GEE datasets search page.
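As a rough illustration of that URL-building step, here is a minimal sketch. It is not the code from the project repository: the `CATALOG_SEARCH_URL` value, the `build_search_url` helper, the shape of the returned dictionary, and the injectable `fetch` parameter are all assumptions made for this example.

```python
from typing import Callable, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint for the Earth Engine catalog; verify it against the
# live site before relying on it.
CATALOG_SEARCH_URL = "https://developers.google.com/s/results/earth-engine/datasets"


def build_search_url(query: str) -> str:
    """Turn a free-text query into a catalog search URL (hypothetical helper)."""
    cleaned = " ".join(query.split())  # collapse stray whitespace
    if not cleaned:
        raise ValueError("query must not be empty")
    return f"{CATALOG_SEARCH_URL}?{urlencode({'q': cleaned})}"


def _http_get(url: str) -> str:
    """Default fetcher: download the page and decode it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def search_gee_catalog(query: str, fetch: Optional[Callable[[str], str]] = None) -> dict:
    """Search the catalog and hand the raw results page back to the agent."""
    url = build_search_url(query)
    page = (fetch or _http_get)(url)
    return {"status": "success", "search_url": url, "content": page}
```

The `fetch` parameter exists only so the sketch can be exercised offline, for example `search_gee_catalog("daily fire detections", fetch=lambda url: "stub page")`. A function registered as an agent tool would normally expose just `query`.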

google_search()

This google_search tool enables our agent to perform web searches using Google Search to augment information from the GEE data catalog and map layers. By the way, you can use any search tool that works with ADK, LangChain, or other agent frameworks!

fetch_webpage_text(url)

This helper tool fetches the full text content of a provided URL, which the agent can use to understand key characteristics such as spatial resolution, date ranges, or intended use cases.
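One way such a helper could work, using only the Python standard library, is sketched below. This is an assumption about the approach rather than the repository's implementation: the `_TextExtractor` class, the `html_to_text` helper, the `max_chars` limit, and the `User-Agent` string are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    _SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its visible text, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_webpage_text(url: str, max_chars: int = 20000) -> dict:
    """Download a page and return its text, truncated to max_chars."""
    request = Request(url, headers={"User-Agent": "gee-agent-sketch/0.1"})
    with urlopen(request, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    return {"status": "success", "url": url, "text": html_to_text(html)[:max_chars]}
```

Truncating the text is a design choice for this sketch: dataset pages can be long, and a cap keeps the content passed to the model at a predictable size.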

Each agent was instantiated using the ADK’s Agent class, with a clear and descriptive name, scope, instructional prompt, and access to the tools that it needs. This allows the root agent to assign tasks intelligently and stay focused on its job. More importantly, it helps the root agent understand 1) which tools it has available, 2) how it should use a given tool, and 3) the overarching goal toward which it should steer the team of agents.

root_agent

The root agent acts as the system’s coordinator. It interprets the user’s query, delegates tasks to the search and web agents, and brings the pieces back together into a structured output that we defined as researchers.

gee_search_agent

This agent is in charge of exploring the Earth Engine data catalog based on the researcher’s needs. It passes search queries to the search_gee_catalog tool and helps researchers discover and assess candidate datasets.

web_search_agent

This agent conducts a broader web search using the google_search tool to find related context, existing research, and metadata. This is especially helpful when researchers are trying to evaluate if a layer will meet their research needs.

web_fetch_agent

This agent retrieves the actual content for specific dataset pages or other web resources using the fetch_webpage_text tool. It hands that content off to other agents (or back to the root agent) for further summarization. It’s designed to retrieve text from any URL without assumptions about the web page’s syntax, structure, frontend frameworks, or other rendering details.
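The roster above could be wired together roughly as follows. This is a sketch under stated assumptions, not the code from the repository: the descriptions and instructions are illustrative placeholders, `DEFAULT_MODEL` is an assumed model name, and `SUB_AGENT_SPECS`, `ROOT_SPEC`, and `build_team` are hypothetical names. The `Agent` import and its `name`, `model`, `description`, `instruction`, `tools`, and `sub_agents` parameters come from ADK, which must be installed before `build_team` is called.

```python
from typing import Dict

# Assumed model identifier; swap in whichever model your setup uses.
DEFAULT_MODEL = "gemini-2.0-flash"

# Plain-data description of the sub-agents. The prompt text is illustrative only.
SUB_AGENT_SPECS = [
    {
        "name": "gee_search_agent",
        "description": "Explores the Earth Engine data catalog for candidate datasets.",
        "instruction": "Call search_gee_catalog with focused queries and report candidates.",
        "tools": ["search_gee_catalog"],
    },
    {
        "name": "web_search_agent",
        "description": "Runs broader web searches for context, research, and metadata.",
        "instruction": "Call google_search to find background on the candidate datasets.",
        "tools": ["google_search"],
    },
    {
        "name": "web_fetch_agent",
        "description": "Retrieves the text of dataset pages and other web resources.",
        "instruction": "Call fetch_webpage_text on the given URL and return the text.",
        "tools": ["fetch_webpage_text"],
    },
]

ROOT_SPEC = {
    "name": "root_agent",
    "description": "Coordinates the team and assembles the final answer.",
    "instruction": (
        "Interpret the research question, delegate to the sub-agents, and "
        "summarize each recommended dataset."
    ),
}


def build_team(tool_registry: Dict[str, object], model: str = DEFAULT_MODEL):
    """Create the root agent and its sub-agents (needs `pip install google-adk`)."""
    # Imported here so the specs above can be inspected without ADK installed.
    from google.adk.agents import Agent

    sub_agents = [
        Agent(
            name=spec["name"],
            model=model,
            description=spec["description"],
            instruction=spec["instruction"],
            tools=[tool_registry[tool_name] for tool_name in spec["tools"]],
        )
        for spec in SUB_AGENT_SPECS
    ]
    return Agent(
        name=ROOT_SPEC["name"],
        model=model,
        description=ROOT_SPEC["description"],
        instruction=ROOT_SPEC["instruction"],
        sub_agents=sub_agents,
    )
```

Here `tool_registry` maps each tool name to the object ADK should receive, for example your own `search_gee_catalog` and `fetch_webpage_text` functions plus the `google_search` tool imported from `google.adk.tools`. Built-in tools such as `google_search` can carry restrictions on how they combine with other tools or sub-agents, so it is worth checking the ADK documentation for the version you install.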

Once the system is up and running, users can just explain their research problem and goals such as, “What are the monthly drought indices for Africa?” or “Help me find layers that can be used to understand how human generated light at night influences nocturnal animal movement behavior”, and the agent researcher team goes to work, helping us at each step along the way.

You can see each agent working together in action, querying the GEE catalog, pulling candidates, ranking relevance, and producing short summaries that include the dataset ID, spatial resolution, time range, refresh rate, and anomalies to know about (e.g. band names, scale, processing level).
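To make the shape of those summaries concrete, here is one hypothetical way to represent and rank them in code. The agents in this project return free text, so the `DatasetSummary` class, its field names, the numeric `relevance` score, and `rank_summaries` are assumptions for illustration only, and the values in the usage note are placeholders rather than real catalog entries.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetSummary:
    """One shortlisted dataset, mirroring the fields listed in the text."""

    dataset_id: str
    spatial_resolution: str
    time_range: str
    refresh_rate: str
    relevance: float  # hypothetical 0.0 to 1.0 score assigned by the agent
    caveats: List[str] = field(default_factory=list)

    def to_text(self) -> str:
        """Render the summary as a short plain-text block."""
        lines = [
            f"Dataset ID: {self.dataset_id}",
            f"Spatial resolution: {self.spatial_resolution}",
            f"Time range: {self.time_range}",
            f"Refresh rate: {self.refresh_rate}",
        ]
        lines.extend(f"Note: {caveat}" for caveat in self.caveats)
        return "\n".join(lines)


def rank_summaries(summaries: List[DatasetSummary]) -> List[DatasetSummary]:
    """Return the summaries ordered from most to least relevant."""
    return sorted(summaries, key=lambda summary: summary.relevance, reverse=True)
```

For example, `rank_summaries([DatasetSummary("EXAMPLE/A", "1 km", "2000 to present", "monthly", 0.4), DatasetSummary("EXAMPLE/B", "250 m", "2012 to present", "daily", 0.9)])` places `EXAMPLE/B` first.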

We didn’t have to write the orchestration logic, agent handoffs, or history tracking. ADK handled that. All we focused on was defining agent roles, specifying tool capabilities, and refining prompt instructions. The whole system came together in a couple of hours of work, including testing, and the final implementation is about 200 lines of code.

You can easily recreate the GEE agent that we built in your own development environment by:

Cloning the GitHub repo using git clone https://github.com/ericabelson/agentic-gee-assistant/
Installing Python and dependencies using pip install -r requirements.txt
Adding your Gemini API key (or any LLM of your choice!) to the .env file in the /gee-agent/ folder
Starting the web interface for ADK by running the adk web command in your terminal
Navigating to http://localhost:8000 in your web browser, selecting gee-agent from the dropdown, and starting to talk and explore geoscience data yourself!
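Before starting the web interface, it can help to confirm that the .env file from the steps above actually contains a key. The sketch below is a hypothetical preflight check, not part of the repository. It assumes the key is stored under the variable name `GOOGLE_API_KEY` (the name used in the ADK quickstart for Gemini API keys) and that the file lives at `gee-agent/.env`; adjust both if your setup differs.

```python
from pathlib import Path
from typing import Dict, List, Sequence

# Assumed variable name; check the repository's README or sample .env file.
REQUIRED_KEYS = ("GOOGLE_API_KEY",)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """List the required variables that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


def check_env_file(path: str = "gee-agent/.env") -> List[str]:
    """Return the missing variables for the .env file at path."""
    env_path = Path(path)
    if not env_path.is_file():
        return list(REQUIRED_KEYS)
    return missing_keys(env_path.read_text(encoding="utf-8"))
```

Calling `check_env_file()` from the repository root returns an empty list when every required variable is set, and the names of the missing ones otherwise.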

Once your agent is up and running, try asking about “MODIS-based vegetation” or “daily fire detections” and see what actions and insights that you and your GEE agent find while you explore together!

Earth Engine is made up of an incredible community of people, tools, and datasets. ADK is a flexible open source library for building AI agents.

Even a simple example such as the multi-agent team prototype that we describe here, agentic-gee-assistant, demonstrates that small teams of specialized AI agents can act as helpful collaborators to researchers everywhere: translating their intent into structured queries, and turning the results of deep research tasks into relevant, well-contextualized answers.

We’re really excited to see how the Earth Engine platform and community continue to grow and evolve, and what we can do as scientists, engineers, and researchers to help! Agentic technologies, LLMs, and developer tooling like this are an exciting chance to help more researchers, educators, and decision-makers understand and adopt more widely accessible, interoperable, and helpful OSS research tools.

Developer resources to get started with

Appendix: A few geospatial ecosystem tools that we love!

This blog post and exploratory effort was possible thanks to the helpful people, tooling, and compute resources that make up the geoscience and OSS communities, some of which are:

Google Earth Engine: A planet-scale platform for Earth science data & analysis
The Awesome GEE Community Catalog is an unfunded open source grassroots project with a mission to help collect community sourced and community generated geospatial datasets.
EE Genie is an interactive Earth Engine Gen AI assistant that works with geemap in Google Colab and can retrieve and analyze images.

Source Credit: https://medium.com/google-cloud/finding-earth-engine-data-doesnt-have-to-be-a-struggle-let-ai-agents-do-the-heavy-lifting-5b869626b506