

LangGraph is a popular framework for building agents. One of its most interesting features is a persistence layer built on checkpoints.
These checkpoints can be ephemeral (in memory) or persisted in a database such as Postgres, among other backends. They also contain a wealth of information about your agent’s state — useful, for example, when you need a human in the loop to evaluate the efficacy of your agent.
There are many ways to extract, save, and analyze this data. One way is to use BigQuery’s Cloud SQL federated queries.
This blog assumes that you have some familiarity with LangGraph and Cloud SQL.
To generate some data, you will need to create a simple chatbot with LangGraph.
Create one with the code below; it has a single agent whose only job is to repeat back whatever input it receives.
This chatbot uses Vertex AI’s Gemini 2.0 Flash model. If you don’t have access to this API, replace the line model = ChatVertexAI(model_name="gemini-2.0-flash-001") with your desired model.
You’ll also need to create a Cloud SQL database and establish a database connection with it. For the purposes of this blog, it assumes you have the connection between your agent and the Cloud SQL backend established. If you need a quick way to connect to Cloud SQL with Python, use Cloud Shell.
You can also follow similar steps in this blog to establish a connection to Cloud SQL. (The blog does it for AlloyDB but you can borrow the concepts for Cloud SQL as well.)
Depending on the method you choose to connect to the database, you need to update the database URI. In the code, the line you need to augment is: DB_URI = f"postgresql://{db_user}:{db_password}@localhost:5432/postgres?sslmode=disable"
The code also assumes you’ve saved the user and password to postgresuser and postgrespass environment variables.
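To check your connection settings before running the full script, you can assemble the same URI on its own. This is a minimal sketch: the fallback credentials here are hypothetical stand-ins so the snippet runs standalone, and the host, port, and database name assume a local connection (for example, the Cloud SQL Auth Proxy listening on localhost:5432).

```python
import os

# Hypothetical fallback values so this snippet runs standalone;
# in practice these come from the environment, as the full script expects.
os.environ.setdefault("postgresuser", "postgres")
os.environ.setdefault("postgrespass", "example-password")

db_user = os.environ["postgresuser"]
db_password = os.environ["postgrespass"]

# Adjust host/port/database if your connection method differs
# (e.g. a proxy listening on a different port).
DB_URI = f"postgresql://{db_user}:{db_password}@localhost:5432/postgres?sslmode=disable"
print(DB_URI)
```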
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.graph.message import add_messages
from langchain_google_vertexai import ChatVertexAI
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg_pool import ConnectionPool
import os

db_user = os.environ['postgresuser']
db_password = os.environ['postgrespass']

class State(TypedDict):
    # Messages have the type "list". The `add_messages` function
    # in the annotation defines how this state key should be updated
    # (in this case, it appends messages to the list, rather than overwriting them).
    messages: Annotated[list, add_messages]

def chatbot_supervisor(state: MessagesState):
    system_prompt = (
        "You are a repeat bot. Add 'thanks, you sent me this, I'm sending you this back:' "
        "at the beginning of each returned message. "
        "Then append the message that was sent in, verbatim."
    )
    messages = [{"role": "system", "content": system_prompt}] + state["messages"]
    ai_msg = model.invoke(messages)
    return {"messages": [ai_msg]}

def stream_graph_updates(user_input: str, config: dict):
    for event in graph.stream({"messages": [("user", user_input)]}, config):
        print(event)

# Instantiate the model object
model = ChatVertexAI(model_name="gemini-2.0-flash-001")

DB_URI = f"postgresql://{db_user}:{db_password}@localhost:5432/postgres?sslmode=disable"
connection_kwargs = {
    "autocommit": True,
    "prepare_threshold": 0,
}
user_input = ""

# Using a pooled connection to connect to Postgres
with ConnectionPool(
    # Example configuration
    conninfo=DB_URI,
    max_size=20,
    kwargs=connection_kwargs,
) as pool:
    # Building out the graph
    graph_builder = StateGraph(State)
    graph_builder.add_node("chatbot_supervisor", chatbot_supervisor)
    graph_builder.add_edge(START, "chatbot_supervisor")
    graph_builder.add_edge("chatbot_supervisor", END)

    checkpointer = PostgresSaver(pool)
    checkpointer.setup()
    graph = graph_builder.compile(checkpointer=checkpointer)

    while user_input != "exit":
        user_input = input("User: ")
        config = {"configurable": {"thread_id": "demo_purposes"}}
        stream_graph_updates(user_input, config)
When you run the code and submit a prompt, the bot repeats it back and prints an additional set of state information. That information is part of the checkpoint data and is now saved in Cloud SQL.
User: Hello
{'chatbot_supervisor': {'messages': [AIMessage(content="thanks, you sent me this, I'm sending you this back: Hello\n", additional_kwargs={}, response_metadata={'is_blocked': False, 'safety_ratings': [], 'usage_metadata': {'prompt_token_count': 41, 'candidates_token_count': 17, 'total_token_count': 58, 'cached_content_token_count': 0}, 'finish_reason': 'STOP', 'avg_logprobs': -0.010888544952168185}, id='…', usage_metadata={'input_tokens': 41, 'output_tokens': 17, 'total_tokens': 58})]}}
Visit the Cloud SQL console and navigate to your instance.
Login to Cloud SQL Studio to access a console where you can interact with the data.
Expand the database you chose to find the checkpoint tables created by LangGraph.
In the checkpoints table, search for the thread_id demo_purposes, which the code above used as the thread ID. Executing the SQL below will return all of your chatbot’s checkpoints.
select *
from public.checkpoints
where thread_id = 'demo_purposes'
BigQuery Cloud SQL federation allows you to query data residing in Cloud SQL in real time without copying or moving your data. This simplifies your tech stack and lets you quickly access any necessary data. You can even use this as an ETL (extract, transform, load) or ELT (extract, load, transform) method.
The queries are made available in BigQuery using the EXTERNAL_QUERY function, which sends your queries down to Cloud SQL and returns the results to BigQuery.
You will first need to create a Cloud SQL connection; you can follow these instructions to do so. Once you create the connection, you will see it in the Explorer panel on the left-hand side of the console.
You’ll need the connection ID to run the query, which you can get by clicking the three dots next to the connection and selecting “Copy ID”.
Once you have the connection created, you can go ahead and use the SQL below to reach out to Cloud SQL from BigQuery!
Replace the “projects/…” connection string in the query below with your own connection ID.
SELECT *
FROM EXTERNAL_QUERY(
"projects//locations/us/connections/private-ip-sql-instance-connection",
"select * from public.checkpoints where thread_id = 'demo_purposes';");
You should see the same results as you saw when you ran it in Cloud SQL.
You’ve successfully queried your LangGraph checkpoint data from BigQuery. You can now perform analysis, evaluations, and anything else you need with this data, with the full power of BigQuery.
You can use another LLM to evaluate or perform sentiment analysis and group the results, right within BigQuery using BigQuery ML. Try it out with another query! (Hint: in order to convert bytes in BigQuery, you’ll want to use a function like SAFE_CONVERT_BYTES_TO_STRING.)
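As a starting point, a decoding query might look like the sketch below. The project and connection names are placeholders, and the exact column that holds bytes can vary across LangGraph versions (some store checkpoint payloads in a separate checkpoint_blobs table), so check your own schema first. Note that SAFE_CONVERT_BYTES_TO_STRING returns NULL for bytes that aren’t valid UTF-8.

```sql
SELECT thread_id,
       SAFE_CONVERT_BYTES_TO_STRING(checkpoint) AS checkpoint_text
FROM EXTERNAL_QUERY(
  "projects/your-project/locations/us/connections/your-connection-id",
  "select thread_id, checkpoint from public.checkpoints where thread_id = 'demo_purposes';");
```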
Now that you have all the plumbing ready and can access the data in BigQuery, try using this as a light-weight ELT method where you create a table from the SQL results.
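For example, a one-shot load into a native BigQuery table could look like this sketch, where the dataset name, project, and connection path are placeholders you’d adapt to your own setup:

```sql
CREATE OR REPLACE TABLE my_dataset.langgraph_checkpoints AS
SELECT *
FROM EXTERNAL_QUERY(
  "projects/your-project/locations/us/connections/your-connection-id",
  "select * from public.checkpoints where thread_id = 'demo_purposes';");
```

Scheduling a query like this gives you periodic snapshots of your checkpoint data without any separate pipeline tooling.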
You can also explore doing the same setup with AlloyDB. Happy querying (and analysis)!
Source Credit: https://medium.com/google-cloud/easy-analytics-using-langgraph-checkpoints-with-bigquery-and-cloud-sql-f9b863828250?source=rss—-e52cf94d98af—4