
At Google Cloud Next 25, we announced a major step forward in geospatial analytics: Earth Engine in BigQuery. This new capability unlocks Earth Engine raster analytics directly in BigQuery, making advanced analysis of geospatial datasets derived from satellite imagery accessible to the SQL community.
Before we get into the details of this new capability and how it can power your use cases, it’s helpful to distinguish between two types of geospatial data and where Earth Engine and BigQuery have historically excelled:
-
Raster data: This type of data represents geographic information as a grid of cells, or pixels, where each pixel stores a value that represents a specific attribute such as elevation, temperature, or land cover. Satellite imagery is a prime example of raster data. Earth Engine excels at storing and processing raster data, enabling complex image analysis and manipulation.
-
Vector data: This type of data represents geographic features such as points, lines, or polygons. Vector data is ideal for representing discrete objects like buildings, roads, or administrative boundaries. BigQuery is highly efficient at storing and querying vector data, making it well-suited for large-scale geographic analysis.
Earth Engine and BigQuery are both powerful platforms in their own right. By combining their geospatial capabilities, we are bringing the best of both raster and vector analytics to one place. That’s why we created Earth Engine in BigQuery, an extension to BigQuery’s current geospatial capabilities that will broaden access to raster analytics and make it easier than ever before to answer a wide range of real-world enterprise problems.
Earth Engine in BigQuery: Key features
You can use the two key features of Earth Engine in BigQuery to perform raster analytics in BigQuery:
-
A new function in BigQuery: Run
ST_RegionStats()
, a new BigQuery geography function that lets you efficiently extract statistics from raster data within specified geographic boundaries. -
New Earth Engine datasets in BigQuery Sharing (formerly Analytics Hub): Access a growing collection of Earth Engine datasets in BigQuery Sharing (formerly Analytics Hub), simplifying data discovery and access. Many of these datasets are analysis-ready, immediately usable for deriving statistics for an area of interest, and providing valuable information such as elevation, emissions, or risk prediction.
Five easy steps to raster analytics
The new ST_RegionStats()
function is similar to Earth Engine’s reduceRegion function, which allows you to compute statistics for one or more regions of an image. The ST_RegionStats()
function is a new addition to BigQuery’s set of geography functions invoked as part of any BigQuery SQL expression. It takes an area of interest (e.g., a county, parcel of land, or zip code) indicated by a geography and an Earth Engine-accessible raster image and computes a set of aggregate values for the pixels that intersect with the specified geography. Examples of aggregate statistics for an area of interest would be maximum flood depth or average methane emissions for a certain county.
These are the five steps to developing meaningful insights for an area of interest:
-
Identify a BigQuery table with vector data: This could be data representing administrative boundaries (e.g., counties, states), customer locations, or any other geographic areas of interest. You can pull a dataset from BigQuery public datasets or use your own based on your needs.
-
Identify a raster dataset: You can discover Earth Engine raster datasets in BigQuery Sharing, or you can use raster data stored as a Cloud GeoTiff or Earth Engine image asset. This can be any raster dataset that contains the information you want to analyze within the vector boundaries.
-
Use ST_RegionStats() to bring raster data into BigQuery: The
ST_RegionStats()
geography function takes the raster data (raster_id
), vector geometries (geography
), and optional band (band_name
) as inputs and calculates aggregate values (e.g., mean, min, max, sum, count) on the intersecting raster data and vector feature. -
Analyze the results: You can use the output of running
ST_RegionStats()
to analyze the relationship between the raster data and the vector features, generating valuable insights about an area of interest. -
Visualize the results: Geospatial analysis is usually most impactful when visualized on a map. Tools like BigQuery Geo Viz allow you to easily create interactive maps that display your analysis results, making it easier to understand spatial patterns and communicate findings.
Toward data-driven decision making
The availability of Earth Engine in BigQuery opens up new possibilities for scaled data-driven decision-making across various geospatial and sustainability use cases, by enabling raster analytics on datasets that were previously unavailable in BigQuery. These datasets can be used with the new ST_RegionStats()
geography function for a variety of use cases, such as calculating different land cover types within specific administrative boundaries or analyzing the average elevation suitability within proposed development areas. You can also find sample queries for these datasets in BigQuery Sharing’s individual dataset pages. For example, if you navigate to the GRIDMET CONUS Drought Indices dataset page, you can find a sample query for calculating mean Palmer Drought Severity Index (PDSI) for each county in California, used to monitor drought conditions across the United States.
Let’s take a deeper look at some of the use cases that this new capability unlocks:
1. Climate, physical risk, and disaster response
Raster data can provide critical insights on weather patterns and natural disaster monitoring. Many of the raster datasets available in BigQuery Sharing provide derived data on flood mapping, wildfire risk assessment, drought conditions, and more. These insights can be used for disaster risk and response, urban planning, infrastructure development, transportation management, and more. For example, you could use the Wildfire Risk to Communities dataset for predictive analytics, allowing you to assess wildfire hazard risk, exposure of communities, and vulnerability factors, so you can develop effective resilience strategies. For flood mapping, you could use the Global River Flood Hazard dataset to understand regions in the US that have the highest predicted inundation depth, or water height above ground surface.
2. Sustainable sourcing and agriculture
Raster data also provides insights on land cover and land use over time. Several of the new Earth Engine datasets in BigQuery include derived data on terrain, elevation, and land-cover classification, which are critical inputs for supply chain management and assessing agriculture and food security. For businesses that operate in global markets, sustainable sourcing requires bringing transparency and visibility to supply chains, particularly as regulatory requirements are shifting commitments to deforestation-free commodity production from being voluntary to mandatory. With the new Forest Data Partnership maps for cocoa, palm and rubber, you can analyze where commodities are grown over time, and add in the Forest Persistence or the JRC Global Forest Cover datasets to understand if those commodities are being grown in areas that had not been deforested or degraded before 2020. With a simple SQL query, you could, for instance, determine the estimated fraction of Indonesia’s land area that had undisturbed forest in 2020.
3. Methane emissions monitoring
Reducing methane emissions from the oil and gas industry is crucial to slow the rate of climate change. The MethaneSAT L4 Area Sources dataset, which can be used as an Earth Engine Image asset with the ST_RegionStats()
function, provides insights into small, dispersed area emissions of methane from various sources. This type of diffuse but widespread emissions can make up the majority of methane emissions in an oil and gas basin. You can analyze the location, magnitude, and trends of these emissions to identify hotspots, inform mitigation efforts, and understand how emissions are characterized across large areas, such as basins.
4. Custom use cases
In addition to these datasets, you can bring your own raster datasets via Cloud Storage GeoTiffs or Earth Engine image assets, to support other use cases, while still benefiting from BigQuery’s scalability and analytical tools.
Source Credit: https://cloud.google.com/blog/products/data-analytics/a-closer-look-at-earth-engine-in-bigquery/