


Time series forecasting is crucial for businesses that need to predict trends, plan resource allocation, and make decisions. Many traditional forecasting methods, such as Autoregressive Integrated Moving Average (ARIMA), are limited when working with large datasets. Foundation models, which borrow the pre-training approach of Large Language Models (LLMs), are changing how time series forecasting is done.
Google Research’s TimesFM (Time Series Foundation Model) stands out as a significant contribution to this new era. Pre-trained on billions of data points, TimesFM exhibits impressive zero-shot performance, often approaching the accuracy of models trained specifically for a given dataset, without requiring that dataset-specific training. Most importantly, TimesFM is now built directly into BigQuery ML, making its forecasting power available as a single SQL function: AI.FORECAST. This means data analysts without ML experience can get predictions for their time series data using familiar SQL syntax.
What makes TimesFM a “foundation model”?
- Pre-trained like LLMs: TimesFM is trained on a large corpus of time series data, which allows it to learn and generalize temporal patterns such as seasonality, trend, and noise.
- Zero-shot learning: TimesFM’s pre-training enables it to make accurate predictions on unseen time series datasets without fine-tuning. This contrasts with conventional models, which must be trained on each dataset before they can forecast it.
The model learns effectively because its pre-training data is large and diverse. It includes real-world data from Google searches and Wikipedia views, synthetic data sampled from mathematical models, and various other datasets from a wide range of fields.
In terms of design, TimesFM adopts the same model architecture as autoregressive LLMs, a decoder-only Transformer, adapted for time series data. It uses a “patch” mechanism: it slices a time series into small contiguous segments, which are analogous to the “tokens” in text. A notable feature is that it can produce output patches that are longer than its input patches, which allows it to make long-term predictions more efficiently.
When using AI.FORECAST, you are not creating a new model or connecting to an external endpoint like gemini-2.0-flash via CREATE MODEL REMOTE. Instead, you are calling a function that uses a pre-existing, managed TimesFM model within Google Cloud. There’s no need to specify a model like gemini-2.0-flash or manage a CONNECTION resource.
The syntax is straightforward:
    SELECT *
    FROM AI.FORECAST(
      { TABLE table_name | (query_statement) },
      data_col => 'your_value_column_name',
      timestamp_col => 'your_timestamp_column_name'
      -- Optional, defaults to TimesFM 2.0
      [, model => 'TimesFM 2.0']
      -- Optional, for multiple series
      [, id_cols => ['id_column_1', 'id_column_2']]
      -- Optional, default 10
      [, horizon => number_of_steps_to_forecast]
      -- Optional, default 0.95
      [, confidence_level => probability_value]
    );
Key Parameters:
- data_col: The column containing numerical values to forecast.
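- timestamp_col: The column with time points (DATE, TIMESTAMP).
- id_cols: The columns that identify each separate time series, used when forecasting multiple series at once.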
- horizon: Number of future time points to predict.
- confidence_level: Probability for prediction intervals (e.g., 0.95 for 95%).
Let’s consider a Fintech company that needs to forecast daily transaction volumes for different mobile payment channels. Accurate forecasts help in liquidity management, fraud detection, and resource allocation.
Scenario: Predict the daily number of transactions for various mobile payment channels (e.g., App Payments, QR Code Payments, SMS Payments) for the next 7 days.
Dummy Dataset: We’ll create a dummy BigQuery table representing historical daily transaction volumes.
    -- Create a dummy table for mobile payment transactions
    CREATE OR REPLACE TEMP TABLE MobilePayments AS
    SELECT
      DATE('2025-01-01') AS transaction_date,
      'App Payments' AS channel,
      CAST(1500 + RAND() * 500 AS INT64) AS transactions_volume -- Base volume + random noise
    UNION ALL
    SELECT
      DATE('2025-01-02') AS transaction_date,
      'App Payments' AS channel,
      CAST(1550 + RAND() * 500 AS INT64) AS transactions_volume
    UNION ALL
    -- ... Add more rows for 'App Payments' and other channels
    -- spanning a reasonable period (e.g., 1 year daily data)
    SELECT
      DATE('2025-01-01') AS transaction_date,
      'QR Code Payments' AS channel,
      CAST(800 + RAND() * 300 AS INT64) AS transactions_volume
    UNION ALL
    SELECT
      DATE('2025-01-02') AS transaction_date,
      'QR Code Payments' AS channel,
      CAST(820 + RAND() * 300 AS INT64) AS transactions_volume
    -- ... Add more data for 'QR Code Payments'
    UNION ALL
    SELECT
      DATE('2025-01-01') AS transaction_date,
      'SMS Payments' AS channel,
      CAST(300 + RAND() * 100 AS INT64) AS transactions_volume
    UNION ALL
    SELECT
      DATE('2025-01-02') AS transaction_date,
      'SMS Payments' AS channel,
      CAST(310 + RAND() * 100 AS INT64) AS transactions_volume
    -- ... Add more data for 'SMS Payments'
    -- Ensure data for each channel spans the same date range.
    -- For a real scenario, you'd use your actual historical data.
    ;
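The rows elided above do not have to be typed by hand. As a minimal sketch, assuming the same three channels and base volumes, a full year of daily dummy data can be generated with GENERATE_DATE_ARRAY. The helper columns base_volume and noise_range and the 2025 date range are illustrative assumptions, not part of the original example. The result keeps the same table name and columns, so it can feed the same forecast query.

```sql
-- Sketch: generate one row per channel per day instead of writing each row.
-- base_volume and noise_range are hypothetical helper columns; the values
-- mirror the hand-written rows above.
CREATE OR REPLACE TEMP TABLE MobilePayments AS
WITH channels AS (
  SELECT 'App Payments' AS channel, 1500 AS base_volume, 500 AS noise_range
  UNION ALL
  SELECT 'QR Code Payments', 800, 300
  UNION ALL
  SELECT 'SMS Payments', 300, 100
)
SELECT
  transaction_date,
  channel,
  -- Base volume plus uniform random noise, as in the original rows
  CAST(base_volume + RAND() * noise_range AS INT64) AS transactions_volume
FROM channels
CROSS JOIN
  -- One element per day; the default step is one day
  UNNEST(GENERATE_DATE_ARRAY(DATE '2025-01-01', DATE '2025-12-31')) AS transaction_date;
```

Because every channel is joined to the same date array, each series spans the same date range. The data is pure noise around a constant level, so it exercises the query but should not be expected to produce interesting forecasts.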
Generating Forecasts with AI.FORECAST:
To forecast daily transaction volume for all channels simultaneously for the next 7 days:
    -- Forecast daily transaction volume for all channels for the next 7 days
    SELECT *
    FROM AI.FORECAST(
      TABLE MobilePayments,
      data_col => 'transactions_volume',   -- Column with the numerical values
      timestamp_col => 'transaction_date', -- Column with the time points
      id_cols => ['channel'],              -- Column identifying each separate time series (payment channel)
      horizon => 7,                        -- Forecast 7 days ahead
      confidence_level => 0.90             -- Request a 90% prediction interval
    );
This single query leverages the power of TimesFM to generate forecasts for each distinct payment channel identified in the channel column.
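The syntax shown earlier also accepts a query statement in place of a table. As a sketch, assuming the MobilePayments table from this example, a single channel can be forecast by filtering in a subquery and omitting id_cols, so the input is treated as one series:

```sql
-- Sketch: forecast one channel only, passing a query statement instead of a table.
SELECT *
FROM AI.FORECAST(
  (
    SELECT transaction_date, transactions_volume
    FROM MobilePayments
    WHERE channel = 'App Payments'
  ),
  data_col => 'transactions_volume',
  timestamp_col => 'transaction_date',
  horizon => 7  -- confidence_level is omitted, so the default of 0.95 applies
);
```

This form is convenient when the source table needs filtering or aggregation to a regular time grain before forecasting.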
Interpreting the Results:
The AI.FORECAST function returns a table with columns like:
- channel: Identifier for the specific time series (e.g., ‘App Payments’).
- forecast_timestamp: The date for each forecasted point.
- forecast_value: The model’s predicted transaction volume for that date.
- prediction_interval_lower_bound and prediction_interval_upper_bound: The range within which the actual value is expected to fall, based on the specified confidence_level.
- confidence_level: The requested confidence level (e.g., 0.90).
- ai_forecast_status: Indicates if the forecast for that series was successful or contains an error message.
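To turn these columns into something a planning team can read, the output can be trimmed and rounded in the same query. This is a sketch under two assumptions: that the MobilePayments table from the example exists, and that ai_forecast_status is empty (or NULL) when a forecast succeeds. The aliases forecast_date, expected_transactions, lower_bound, and upper_bound are illustrative names.

```sql
-- Sketch: keep successful forecasts only and present a compact planning view.
SELECT
  channel,
  DATE(forecast_timestamp) AS forecast_date,
  ROUND(forecast_value) AS expected_transactions,
  ROUND(prediction_interval_lower_bound) AS lower_bound,
  ROUND(prediction_interval_upper_bound) AS upper_bound
FROM AI.FORECAST(
  TABLE MobilePayments,
  data_col => 'transactions_volume',
  timestamp_col => 'transaction_date',
  id_cols => ['channel'],
  horizon => 7,
  confidence_level => 0.90
)
-- Assumption: an empty status means the series was forecast without error
WHERE COALESCE(ai_forecast_status, '') = ''
ORDER BY channel, forecast_date;
```

A wide gap between lower_bound and upper_bound signals that the model is less certain about that day, which is worth considering before committing resources to the point forecast.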
Source Credit: https://medium.com/google-cloud/forecasting-the-future-with-bigqueryml-timesfm-a-game-changer-in-time-series-analysis-43a627151118?source=rss----e52cf94d98af---4