

For developers building AI-powered applications, bridging the gap between analytical data stores and low-latency serving engines is a common challenge. The new BigQuery import feature for Vertex AI Vector Search, now in Public Preview, offers a powerful solution.
This feature is a game-changer for any developer who has struggled with the complexities of moving data from a data warehouse like BigQuery into a high-performance vector database.
BigQuery is a popular choice for storing and managing large datasets, including the vector embeddings and metadata that power AI applications. However, once that data is in BigQuery, a significant challenge has been how to efficiently get it into a low-latency serving engine like Vector Search.
Before this new import feature, the process was cumbersome and required writing custom code to orchestrate a multi-step pipeline:
- Export from BigQuery to Cloud Storage: First, you had to write a script to export your data from a BigQuery table into a file format (like JSON or CSV) and save it in a Cloud Storage bucket.
- Import from Cloud Storage to Vector Search: Next, you had to write another piece of code to trigger the import process from Cloud Storage into your Vector Search index, ensuring the data was correctly formatted.
This manual, two-hop process was not only inefficient but also introduced multiple points of failure and required developers to build, manage, and maintain custom ETL code. It was a common bottleneck that slowed down development and added unnecessary complexity.
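To see why the old pipeline demanded custom code, here is a minimal sketch of the formatting step it forced on you: reshaping exported rows into the JSONL datapoint format that Vector Search expects for Cloud Storage imports. The row dicts here are illustrative stand-ins for BigQuery query results, and the helper name is my own.

```python
import json


def rows_to_datapoints_jsonl(rows):
    """Reshape exported row dicts into Vector Search's JSONL input format:
    one JSON object per line with "id", "embedding", and optional "restricts".
    """
    lines = []
    for row in rows:
        datapoint = {
            "id": str(row["id"]),
            "embedding": [float(v) for v in row["embedding"]],
        }
        if row.get("restricts"):
            datapoint["restricts"] = row["restricts"]
        lines.append(json.dumps(datapoint))
    return "\n".join(lines)


# Two hypothetical rows, as they might come back from a BigQuery export.
rows = [
    {"id": 1, "embedding": [0.1, 0.2],
     "restricts": [{"namespace": "genre", "allow": ["drama"]}]},
    {"id": 2, "embedding": [0.3, 0.4]},
]
print(rows_to_datapoints_jsonl(rows))
```

This file would then be written to a Cloud Storage bucket before triggering the index import, the second hop the new feature makes unnecessary.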
The new BigQuery import feature for Vector Search removes this friction. It provides a direct, one-time import mechanism that lets you populate a Vector Search index straight from a BigQuery table, with no intermediate Cloud Storage hop.
Note: This is an import tool, not a real-time data synchronization pipeline as of Aug 2025. It’s designed to simplify and automate the process of loading data that has been prepared in BigQuery into Vector Search for low-latency serving.
Here are its key benefits:
- Eliminates Manual ETL: You can populate a Vector Search index directly from your BigQuery tables, removing the need for custom export and formatting scripts.
- Effortless Metadata Transfer: You can specify which columns from a BigQuery table should be carried over as metadata in the Vector Search index. This makes it easy to return rich, contextual results to users.
- Enables Low-Latency Applications: By simplifying the data loading process, it becomes much easier to build applications that can query the data in Vector Search with high performance.
For those who want to get their hands dirty, here’s a quick walkthrough of the process.
First, make sure your BigQuery table is set up correctly. You’ll need a few key columns:
- A Unique Identifier: This will be your id in Vector Search.
- Vector Embeddings: This column must be a repeated FLOAT64 field (ARRAY<FLOAT64>), with one element per embedding dimension.
- Optional Metadata: You can also include columns for restricts (for filtering) and any other metadata you want to retrieve with your search results.
Next, you need a destination for your data. Make sure you have a Vector Search index created in your project. The dimensionality of this index must match the dimensionality of the embeddings in your BigQuery table.
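A dimensionality mismatch only surfaces after the import kicks off, so it can be worth sanity-checking your data up front. This is a minimal sketch using plain dicts as stand-ins for BigQuery rows; the helper and its checks are my own, not part of any Google Cloud API.

```python
def validate_embedding_rows(rows, index_dimensions):
    """Pre-import sanity checks: unique ids, float embedding values,
    and a dimensionality that matches the target Vector Search index."""
    seen_ids = set()
    for row in rows:
        row_id = row["id"]
        if row_id in seen_ids:
            raise ValueError(f"duplicate id: {row_id}")
        seen_ids.add(row_id)
        embedding = row["embedding"]
        if len(embedding) != index_dimensions:
            raise ValueError(
                f"id {row_id}: expected {index_dimensions} dimensions, "
                f"got {len(embedding)}"
            )
        if not all(isinstance(v, float) for v in embedding):
            raise ValueError(f"id {row_id}: embedding values must be floats")
    return len(rows)
```

Running this over a sample of your table (e.g. the first few thousand rows) catches the most common import failures cheaply.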
The magic happens with the ImportIndex API. You’ll send a request that specifies the target index and the BigQuery source configuration. You can choose to either overwrite the existing data in the index or append the new data.
Here’s what the ImportIndexRequest looks like for a one-time import:
```json
{
  "name": "projects/[PROJECT_ID]/locations/[LOCATION]/indexes/[INDEX_ID]",
  "isCompleteOverwrite": true,
  "config": {
    "bigQuerySourceConfig": {
      "tablePath": "[PROJECT_ID].[DATASET_ID].[TABLE_ID]",
      "datapointFieldMapping": {
        "idColumn": "[ID_COLUMN_NAME]",
        "embeddingColumn": "[EMBEDDING_COLUMN_NAME]",
        "restricts": [
          {
            "namespace": "genre",
            "allowColumn": ["genre_allow_column"]
          }
        ],
        "numericRestricts": [
          {
            "namespace": "year",
            "valueColumn": "year_column",
            "valueType": "INT"
          }
        ],
        "metadataColumns": ["title", "description"]
      }
    }
  }
}
```
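If you prefer to assemble the request in code, here is a rough Python sketch of a builder for that body. The field names mirror the request shown above; the REST endpoint URL is my assumption based on the standard regional Vertex AI endpoint pattern, so verify it against the official documentation before use.

```python
def build_import_index_request(project_id, location, index_id, table_path,
                               id_column, embedding_column,
                               metadata_columns=None, overwrite=True):
    """Assemble an ImportIndexRequest body matching the JSON shown above.

    Returns (url, body). The URL pattern is an assumption: the standard
    regional Vertex AI REST form, with an ":import" verb on the index.
    """
    index_name = f"projects/{project_id}/locations/{location}/indexes/{index_id}"
    mapping = {
        "idColumn": id_column,
        "embeddingColumn": embedding_column,
    }
    if metadata_columns:
        mapping["metadataColumns"] = list(metadata_columns)
    body = {
        "name": index_name,
        "isCompleteOverwrite": overwrite,
        "config": {
            "bigQuerySourceConfig": {
                "tablePath": table_path,
                "datapointFieldMapping": mapping,
            }
        },
    }
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"{index_name}:import")
    return url, body
```

You would POST this body to the returned URL with an authenticated client (for example, an authorized session from the google-auth library).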
Once you send this request, the import process will begin as a long-running operation. You can monitor its progress using the operation ID. After the import is complete, your data is ready to be queried in Vector Search.
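The polling loop for that long-running operation is the same regardless of client library. Here is a generic sketch with an injectable fetch callable, so the loop itself works with any client that can return the operation state; the dict shape with a "done" key follows the standard google.longrunning Operation format.

```python
import time


def wait_for_operation(get_operation, timeout_s=600, poll_interval_s=5):
    """Poll a long-running operation until it reports done.

    `get_operation` is any callable returning the operation's current
    state as a dict with a "done" key (and an "error" key on failure),
    e.g. a thin wrapper around the operations REST endpoint.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        op = get_operation()
        if op.get("done"):
            if "error" in op:
                raise RuntimeError(f"operation failed: {op['error']}")
            return op
        time.sleep(poll_interval_s)
    raise TimeoutError("operation did not finish in time")
```

For a one-off import, checking the operation in the Cloud Console works just as well; a loop like this is mainly useful when the import is one step in a larger automated workflow.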
For a deeper dive, check out the official documentation.
If you’re a developer who uses BigQuery as a source of truth for embeddings and metadata, and you need to serve that data in a low-latency, online application, then this import feature is for you. It’s perfect for use cases like:
- Product recommendations
- Semantic search
- RAG and agentic applications
This new importer can save developers a ton of time and effort by simplifying the data loading workflow. If you’re building AI applications on Google Cloud, it’s highly recommended to give it a try.
Source Credit: https://medium.com/google-cloud/now-available-direct-importing-from-bigquery-to-vector-search-f4ba01b48dfd