Use Case: Transforming Retail Logs into Customer Insights
Let’s stop looking at snippets and build something real.
For this example, we are using a raw Retail Sales Dataset. It contains transaction logs with columns like Transaction ID, Date, Customer ID, Gender, Age, Product Category, and Total Amount.
The Challenge: The data tells us what happened (e.g., A 34-year-old Male bought ‘Beauty’ products for $150), but it doesn’t tell us who they are or the context of the purchase.
- We want to segment customers into “Marketing Personas” without writing a 100-line rule engine.
- We want to identify “Holiday Rush” purchases based on dates and categories.
- We want to rank transactions by “Luxury Potential” to target high-net-worth individuals.
Step 1: Automating Customer Personas with AI.CLASSIFY
Usually, assigning a “Persona” requires complex logic: “If Age < 25 and Spend > 500 is Gen Z Spender, but if Age > 50…”
With AI.CLASSIFY, we simply feed the demographics and spending data to the model and let it categorize the customer for us.
SELECT
`Customer ID` AS ID,
Age,
Gender,
`Product Category` AS Category,
`Total Amount` AS Amount,
AI.CLASSIFY((CAST(Age as STRING), Gender, `Product Category`, CAST(`Total Amount` AS STRING)), categories => ARRAY["Trend Hunter", "Tech Enthusiast",
"Budget Shopper", "Luxury Connoisseur", "Casual Buyer"], connection_id => 'us.VertexAI-External') AS marketing_persona
FROM
dataset.retail_sales_dataset
LIMIT 10;
The Result:
- BigQuery looks at Customer 453 (Female, 26, bought Clothing for $1000) and tags her as a “Luxury Connoisseur”.
- It looks at Customer 605 (Male, 37, bought Electronics for $1000) and tags him as a “Tech Enthusiast”.
We just enriched our raw logs with qualitative marketing segments in a single query.
Step 2: Contextual Filtering with AI.IF
Our marketing team wants to send a “Post-Holiday” retention email. They need to find transactions that likely happened during a holiday rush context.
A standard SQL query would need a list of every holiday date. AI.IF understands calendars and context natively.
SELECT
`Transaction ID`,
Date,
`Product Category`,
`Total Amount`
FROM dataset.retail_sales_dataset
WHERE
-- We ask a semantic question about the date and category
AI.IF(
("Is this transaction date likely part of a major winter holiday shopping season?", cast(Date as STRING)), connection_id => 'us.VertexAI-External'
)
AND `Total Amount` > 100;
The Result:
- The query automatically filters for dates like 2023–11–24 (Black Friday window) and 2023–12–26 (Boxing Day), ignoring transactions from random dates in February or April. It understands “Winter Season” without us hard-coding BETWEEN ‘2023–11–01’ AND ‘2023–12–31’.
Step 3: Ranking Value with AI.SCORE
We want to identify “Splurge” purchases to offer a premium loyalty program. Sorting by price isn’t enough — spending $50 on “Beauty” is different than spending $50 on “Electronics.” We need a score that understands the relative value of the purchase within its category.
SELECT
`Transaction ID`,
`Product Category`,
`Price per Unit`,
`Total Amount`,
-- Score how 'luxurious' or high-end this purchase feels
AI.SCORE(
("Rate this purchase on a 'Splurge' scale of 1-10 based on the item category and price per unit.", `Product Category`, cast(`Price per Unit` as STRING)), connection_id => 'us.VertexAI-External'
) AS splurge_score
FROM dataset.retail_sales_dataset
ORDER BY splurge_score DESC
LIMIT 5;
What happens here?
The output confirms that the model applies category-specific reasoning rather than just sorting by raw numbers.
1. The High Rollers (Scores 9.0–10.0)
In the top results, we see purchases of $500 receiving nearly perfect scores. However, notice the nuance:
- Beauty Products at $500 receive a 10.0: The model understands that $500 is an exceptionally high price for a single beauty item (likely a luxury device or premium skincare), representing the absolute ceiling of that category.
- Electronics at $500 receive a 9.0: While $500 is expensive, the model likely knows that electronics pricing extends into the thousands (e.g., high-end laptops), so it holds back the perfect 10.
2. The Budget Buys (Score 1.0)
At the bottom of the list, the model consistently assigns a 1.0 to items priced at $25 — $30 across all categories. Whether it is a t-shirt (“Clothing”) or a cable (“Electronics”), Gemini correctly identifies these as entry-level, functional purchases with zero “splurge” factor.
Why this beats standard SQL:
A standard ORDER BY Price would mix these categories indiscriminately. AI.SCORE normalizes the value, allowing you to find the “top tier” customers across every category, regardless of the raw price point.
Source Credit: https://medium.com/google-cloud/sql-just-got-a-brain-introducing-bigquerys-native-ai-functions-7847778a1a8e?source=rss—-e52cf94d98af—4
