Many organizations are overwhelmed by large amounts of fragmented data and are unclear on how it affects their business objectives. This creates a critical disconnect: data consumers (the analysts and data scientists who need the data to generate insights) can’t easily discover, access, or trust the data they need, while data producers (the teams who own these data assets) can’t give consumers self-service access to that data. Without easy access to reliable, context-rich data, organizations can struggle to adopt AI and agent technologies.
To help organizations overcome these challenges, we are introducing data products in Dataplex Universal Catalog, Google Cloud’s unified, intelligent data-to-AI governance solution. A data product is a curated, ready-to-use package of data assets, documentation, and governance controls, all purposefully assembled to solve a specific business problem. More than just data, data products can help you demonstrate business value and fuel AI innovation within your organization. Data products are available now in preview.
Understanding data products
At its core, a data product is a logical unit of distribution that models how a group of assets addresses a business problem. We think of it as a product on a shelf, complete with a label, instructions, and quality guarantees, but for data. It abstracts raw data assets into trusted, discoverable, and valuable resources for the entire organization. This allows data teams to more efficiently:
- Define expectations: Instead of answering the same ad-hoc questions again and again, data producers can catalog information about data quality, freshness, and intended use cases directly within the data product’s documentation and contracts.
- Reduce management toil: Data products allow you to group assets logically by use case. This simplifies access management, reducing the manual effort required to manage individual assets.
- Demonstrate value: By linking data assets directly to the business use cases they serve, data teams can clearly demonstrate the value they create and justify their budget spend based on impact, not just history.
How to use data products
At a high level, data products deliver the following foundational capabilities:
- Design for use case: Identify the business problem and model the data product to solve for a use case.
- Establish ownership: Define the ownership of the data product to ensure accountability.
- Democratize context: Document the problems the product addresses with usage examples and expectations.
- Define contracts: Provide trust to consumers and communicate contractual guarantees.
- Govern assets: Administer who can view the product and regulate access to the data assets.
- Enable discovery: Help data consumers easily discover and request access to data products.
- Evolve offerings: Iterate and evolve the product to address consumer needs.
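To make the idea of a contractual guarantee more concrete, here is a minimal Python sketch of a freshness check. It does not use the Dataplex Universal Catalog API: the `FreshnessContract` class, the 24-hour threshold, and the timestamps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FreshnessContract:
    """Hypothetical contract term: data must have been refreshed recently enough."""

    max_staleness: timedelta

    def is_met(self, last_refreshed: datetime, now: datetime) -> bool:
        # The guarantee holds when the time since the last refresh
        # does not exceed the promised maximum staleness.
        return now - last_refreshed <= self.max_staleness


# Assumed guarantee: the data is never more than 24 hours old.
contract = FreshnessContract(max_staleness=timedelta(hours=24))

# Fixed timestamps keep the example deterministic.
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
fresh = contract.is_met(datetime(2025, 1, 2, 3, 0, tzinfo=timezone.utc), now)
stale = contract.is_met(datetime(2024, 12, 31, 3, 0, tzinfo=timezone.utc), now)

print(fresh, stale)  # prints "True False"
```

A consumer-facing check like this is one way a producer might verify a published guarantee before consumers rely on it.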
What does that mean in practice? Imagine you’re a data producer for a marketing team. Data consumers such as data scientists in your organization consistently need to analyze quarterly campaign performance to recommend future adjustments. Here’s how you can empower them with a “Marketing campaign analysis” data product:
- Create the data product: Start by creating a new data product named “Marketing campaign analysis.” You assign yourself as the owner and provide contact details.
- Curate cross-project assets: You then add the relevant assets needed for the analysis. For example, you can include BigQuery tables and views: ad_spend_daily, customer_conversions, website_traffic_logs.
- Define roles and permissions: To govern and manage access on the assets in your data product, create a data_scientist group and grant this group viewer access on all of the assets.
- Establish a data contract: To build trust, specify the refresh frequency of data products and communicate the contract terms.
- Add rich documentation: Finally, add a detailed description explaining that this data product is the single source of truth for campaign analysis. You include examples of SQL queries and links to additional artifacts.
Source Credit: https://cloud.google.com/blog/products/data-analytics/introducing-data-products-in-dataplex-universal-catalog/
