Data volumes keep exploding, yet most raw files still live in cheap cloud storage. BigLake lets me query files stored in Google Cloud Storage, Amazon S3, or Azure Data Lake using standard SQL — as if they were native BigQuery tables, but with added row- and column-level security.
Below is my own practical walk-through for absolute beginners: no training-portal jargon, just clear steps, fresh examples, and real-world analogies. It is based on the hands-on lab BigLake: Qwik Start (GSP1040).
In this guide:
1. Set up a Connection Resource
2. Grant secure access to files
3. Create your first BigLake table
4. Add column-level security
5. Upgrade an external table to BigLake
Why it matters:
As organizations collect more semi-structured and unstructured data, traditional data warehouses can no longer keep up. BigLake bridges the gap between cheap object storage and powerful analytics without compromising on security or performance.
What it is
A Connection Resource gives BigQuery a dedicated service account. BigQuery routes every file request through this account instead of poking the bucket directly.
Think of a public library: readers stay in the reading room while a curator fetches books from the archive. The curator (Connection) guarantees that no rare volume “disappears in a backpack.”
Open IAM, find the service account of the connection, and grant the role Storage Object Viewer.
Now the curator can read files but cannot delete or overwrite them.
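If you prefer Cloud Shell to the console, here is a minimal sketch. It assumes the connection lives in the US location under the name my-connection (the same identifiers the bq commands later in this guide use) and grants the role at the project level rather than on a single bucket:
# Save your project ID
export PROJECT_ID=$(gcloud config get-value project)

# Create the Cloud Resource connection
bq mk --connection --location=US --project_id=$PROJECT_ID \
    --connection_type=CLOUD_RESOURCE my-connection

# Look up the service account the connection created (jq is preinstalled in Cloud Shell)
SERVICE_ACCOUNT=$(bq show --format=json --connection $PROJECT_ID.US.my-connection \
    | jq -r '.cloudResource.serviceAccountId')

# Grant read-only access to objects: the "curator" can fetch books but not remove them
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SERVICE_ACCOUNT" \
    --role="roles/storage.objectViewer"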
Good to know
The moment the connection works, I strip direct bucket access from ordinary users. That way no one can sidestep my security policies by downloading the raw files.
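For example, a direct grant on the bucket can be revoked like this. It is only a sketch: the bucket name matches the Project ID as in this lab, analyst@example.com is a hypothetical user, and PROJECT_ID was exported in the previous snippet:
# Hypothetical: revoke a user's direct read access to the bucket
gcloud storage buckets remove-iam-policy-binding gs://$PROJECT_ID \
    --member="user:analyst@example.com" \
    --role="roles/storage.objectViewer"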
Create your first BigLake table
In the BigQuery console I specify four things:
1. Dataset — my logical shelf in BigQuery; let’s call it demo_dataset.
2. Table type — External Table, then tick “Use BigLake.”
3. File source — in the demo it’s customer.csv, but in real life it might be Parquet sensor logs or Avro clickstreams.
4. Schema — I define the columns up front so I can demonstrate column-level security later (a CLI sketch of these steps follows the schema):
[
{
"name": "customer_id",
"type": "INTEGER",
"mode": "REQUIRED"
},
{
"name": "first_name",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "last_name",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "company",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "address",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "city",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "state",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "country",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "postal_code",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "phone",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "fax",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "email",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "support_rep_id",
"type": "INTEGER",
"mode": "NULLABLE"
}
]
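If you prefer to script this step, the same table can be declared with a single DDL statement from Cloud Shell. Treat it as a sketch, not the lab’s exact recipe: the table name biglake_table and the path gs://$PROJECT_ID/customer.csv are my assumptions, column modes are dropped (only names and types are declared), and you may need skip_leading_rows = 1 in OPTIONS if the CSV carries a header row.
# Assumes PROJECT_ID is exported: export PROJECT_ID=$(gcloud config get-value project)
bq query --use_legacy_sql=false "
CREATE EXTERNAL TABLE demo_dataset.biglake_table (
  customer_id INT64,
  first_name STRING,
  last_name STRING,
  company STRING,
  address STRING,
  city STRING,
  state STRING,
  country STRING,
  postal_code STRING,
  phone STRING,
  fax STRING,
  email STRING,
  support_rep_id INT64
)
WITH CONNECTION \`$PROJECT_ID.US.my-connection\`
OPTIONS (
  format = 'CSV',
  uris = ['gs://$PROJECT_ID/customer.csv']
)"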
Why BigLake beats a basic external table
I instantly gain:
- detailed audit logs,
- row- and column-level masking,
- smart caching that saves scan costs.
I run a quick query against the new table (see the sketch below). If I see rows, everything is wired correctly: BigQuery found the file, the connection unlocked it, and the permissions are aligned.
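A minimal check from Cloud Shell, again assuming the table is named demo_dataset.biglake_table:
# Preview a few rows to confirm the table is readable through the connection
bq query --use_legacy_sql=false \
    "SELECT * FROM demo_dataset.biglake_table LIMIT 5"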
Fun fact
Because BigLake uses the same engine that powers the BigQuery Storage API, the same file can be read by Spark, Presto, or a Jupyter notebook — each tool will still respect the security tags I set.
Add column-level security
1. I create a taxonomy in Data Catalog (for example, “Sensitive”).
2. In the table schema I select the columns to protect: address, postal_code, and phone.
3. I click Add policy tag, expand the taxonomy by clicking its name, and select biglake-policy. After clicking Select, the columns have the policy tag attached; I save the schema, and the protected columns now show the tag.
4. I run two queries (a CLI sketch follows this list):
- a full SELECT * returns Access Denied,
- SELECT * EXCEPT(address, phone, postal_code) runs fine.
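The same two queries from Cloud Shell, with the table name demo_dataset.biglake_table assumed as above; the first fails for anyone who lacks the Fine-Grained Reader role on the policy tag:
# Touches the tagged columns: expect an access-denied error
bq query --use_legacy_sql=false \
    "SELECT * FROM demo_dataset.biglake_table LIMIT 10"

# Skips the tagged columns: runs fine
bq query --use_legacy_sql=false \
    "SELECT * EXCEPT(address, phone, postal_code) FROM demo_dataset.biglake_table LIMIT 10"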
Upgrade an external table to BigLake
Below is the exact, screen-for-screen procedure: first you build a plain external table in the web console, then you attach a Cloud Resource connection with three command-line calls.
Build the external table in the BigQuery UI
1. Open BigQuery → Studio and hover over the dataset demo_dataset. Click the ⋮ (three dots) and choose Create table.
2. In Create table from, pick Google Cloud Storage.
3. Click Browse, navigate into the bucket that carries your Project ID, select invoice.csv, and hit Select.
4. Under Destination, confirm your project and the dataset demo_dataset.
5. Type external_table as the table name.
6. Set Table type to External Table. Important: do not choose a Cloud Resource connection yet.
7. Scroll to Schema, switch on Edit as text, and paste the JSON schema:
[
{
"name": "invoice_id",
"type": "INTEGER",
"mode": "REQUIRED"
},
{
"name": "customer_id",
"type": "INTEGER",
"mode": "REQUIRED"
},
{
"name": "invoice_date",
"type": "TIMESTAMP",
"mode": "REQUIRED"
},
{
"name": "billing_address",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "billing_city",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "billing_state",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "billing_country",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "billing_postal_code",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "total",
"type": "NUMERIC",
"mode": "REQUIRED"
}
]
8. Click Create table.
You now have an external table that reads the CSV file directly but still lacks BigLake features.
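If you would rather skip the console clicks, here is a rough CLI equivalent. It uses --autodetect as a shortcut instead of pasting the JSON schema, so treat it as a sketch rather than the lab’s exact procedure:
# Save your project ID (same as in the next section)
export PROJECT_ID=$(gcloud config get-value project)

# Table definition for a plain external table (no connection yet)
bq mkdef --autodetect --source_format=CSV \
    "gs://$PROJECT_ID/invoice.csv" > /tmp/plain_def.json

# Create the external table from that definition
bq mk --table \
    --external_table_definition=/tmp/plain_def.json \
    demo_dataset.external_table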
Generate a BigLake-ready table definition
Open Cloud Shell and run:
# Save your project ID
export PROJECT_ID=$(gcloud config get-value project)

# Build a table definition that embeds the connection
bq mkdef \
--autodetect \
--connection_id=$PROJECT_ID.US.my-connection \
--source_format=CSV \
"gs://$PROJECT_ID/invoice.csv" \
> /tmp/tabledef.json
(The --connection_id line is what turns a plain external table into a BigLake candidate.)
Capture the current schema
bq show --schema --format=prettyjson demo_dataset.external_table > /tmp/schema
Attach the connection to the live table
bq update \
--external_table_definition=/tmp/tabledef.json \
--schema=/tmp/schema \
demo_dataset.external_table
The command returns silently when the update succeeds.
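Two optional checks, offered as a sketch: the connectionId field name is assumed from the BigQuery table resource, and the query simply proves the upgraded table still reads the CSV through the connection.
# The table metadata should now reference the connection
bq show --format=prettyjson demo_dataset.external_table | grep -i connection

# The upgraded table still returns data
bq query --use_legacy_sql=false \
    "SELECT COUNT(*) AS row_count FROM demo_dataset.external_table"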
Source Credit: https://medium.com/google-cloud/from-csv-to-secure-analytics-a-hands-on-beginners-guide-to-biglake-in-bigquery-2025-edition-ad34badaf983
