This is a comprehensive handbook on sector-wise use cases and best practices for data products in Google Cloud’s Knowledge Catalog.
Introduction to data products
Centralized data architectures often create operational bottlenecks and result in opaque “data swamps” of ambiguous, low-quality data. To empower consumers, the industry is shifting to a decentralized approach: Data Mesh, which treats data as a discoverable, trustworthy, secure, and interoperable product.
Google Cloud facilitates this modern architectural transition through the Knowledge Catalog, a universal context engine that operationalizes data products as curated, purposeful logical units (akin to a well-labeled product on a shelf) while simultaneously providing the high-quality context necessary to ground and accelerate advanced AI and agentic initiatives.
Anatomy of a data product
In Knowledge Catalog, a data product packages data assets (e.g., BigQuery tables, views, and Cloud Storage paths) within a semantic wrapper of context, trust, and governance. The four fundamental pillars of a data product are:
- Design for use case: Bundles data assets (pointers to physical resources like BigQuery tables or GCS paths) for a specific business outcome, eliminating the need for users to “hunt” for individual tables.
- Context: Enriches data with insights, sample queries, documentation, and structured metadata to provide business context.
- Access Groups: Simplifies governance by mapping functional roles called access groups (e.g., “Analyst”, “Reader”) to automatic IAM bindings across all underlying assets, streamlining access workflows.
- Contracts: Establishes trust and communicates contractual guarantees.
This unified packaging enables a frictionless discovery-to-consumption lifecycle. For example, a data scientist analyzing marketing ROI no longer needs to query multiple systems or wait weeks for access approvals. Instead, they search the catalog for the “Marketing Campaign Analysis” product, review its context and SLAs, and request access with a single click, instantly unlocking all required assets.
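To make the four pillars concrete, the following is a minimal sketch in plain Python. It is not the Knowledge Catalog API: the `Asset` and `DataProduct` classes, their field names, and the example resource name are all hypothetical and exist only to show how the pillars fit together in one logical unit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """Pointer to a physical resource; the data itself stays where it is."""
    kind: str      # illustrative label, e.g. "bigquery_table" or "gcs_path"
    resource: str  # hypothetical resource name


@dataclass
class DataProduct:
    """Hypothetical in-memory model of a data product, one field per pillar."""
    name: str
    use_case: str                                                  # design for use case
    assets: list[Asset] = field(default_factory=list)              # bundled asset pointers
    context: dict[str, str] = field(default_factory=dict)          # docs, sample queries
    access_groups: dict[str, list[str]] = field(default_factory=dict)  # role -> principals
    contract: dict[str, str] = field(default_factory=dict)         # guarantees such as SLAs

    def missing_pillars(self) -> list[str]:
        """Return the names of the parts that have not been filled in yet."""
        checks = {
            "use_case": bool(self.use_case.strip()),
            "assets": bool(self.assets),
            "context": bool(self.context),
            "access_groups": bool(self.access_groups),
            "contract": bool(self.contract),
        }
        return [name for name, ok in checks.items() if not ok]


product = DataProduct(
    name="Marketing Campaign Analysis",
    use_case="Analyze marketing ROI",
    assets=[Asset("bigquery_table", "example-project.marketing.campaign_spend")],
    context={"documentation": "Daily campaign spend and attributed revenue."},
    access_groups={"Analyst": ["group:marketing@acme.com"]},
)
print(product.missing_pillars())  # the contract has not been defined in this example
```

A check like `missing_pillars` could serve as a simple review gate before a product is published, although which parts are mandatory is a policy decision for each organization.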
Classification of data products
Successful implementations of a modern decentralized architecture do not simply label every existing table as a “product.” Instead, they organize data products into a highly structured typology. This architectural layering ensures that data is managed appropriately at every stage of its lifecycle, from raw operational ingestion to highly refined business intelligence. Data products can generally be categorized into three distinct layers: Source-Aligned, Domain-Aligned (Aggregate), and Consumption-Aligned.
Table 1: Comparing source vs domain vs consumption aligned data products
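The layering can be sketched as an ordered enumeration. The sketch below assumes, as an illustration only, that data flows from source-aligned to domain-aligned to consumption-aligned products and flags any product that reads from a more refined layer. The product names and the `upstream_violations` helper are hypothetical.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The three layers, ordered from raw ingestion to refined consumption."""
    SOURCE_ALIGNED = 1
    DOMAIN_ALIGNED = 2
    CONSUMPTION_ALIGNED = 3


def upstream_violations(
    layers: dict[str, Layer], upstreams: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (product, upstream) pairs where a product reads from a more refined layer."""
    return [
        (product, upstream)
        for product, sources in upstreams.items()
        for upstream in sources
        if layers[upstream] > layers[product]
    ]


# Hypothetical products, one per layer.
layers = {
    "pos_transactions": Layer.SOURCE_ALIGNED,
    "customer_360": Layer.DOMAIN_ALIGNED,
    "personalization_features": Layer.CONSUMPTION_ALIGNED,
}
upstreams = {
    "customer_360": ["pos_transactions"],
    "personalization_features": ["customer_360"],
    "pos_transactions": ["personalization_features"],  # deliberately wrong direction
}
print(upstream_violations(layers, upstreams))
```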
Sector-wise use cases
Let’s apply this framework to practical examples from the retail, telecommunications, and financial services sectors and see how each can use data products to transform everyday business operations.
Retail: Personalization and inventory optimization
In modern retail, supply chain agility and customer personalization define market leaders. Retailers must rapidly synthesize vast amounts of point-of-sale (POS) data, e-commerce traffic logs, and complex inventory metrics to remain competitive.
Image 1: Source vs Domain vs Consumption data products for retail. Generated by Gemini
Table 2: Comparing source vs domain vs consumption data products for retail
Telecommunications: From raw telemetry to churn prediction
Telecommunications operators generate some of the highest-velocity data assets, including network telemetry and billing events, every day. Ownership of this data is often siloed: network engineers view telemetry to assess cell tower performance, while customer support views billing records. These silos prevent a unified view of the customer experience and lead to reactive rather than proactive customer retention strategies. Data products can bring these sources together.
Image 2: Source vs Domain vs Consumption data products for telecom. Generated by Gemini
Table 3: Comparing source vs domain vs consumption data products for telecom
Financial services: Fraud detection and regulatory compliance
The financial sector operates under regulatory requirements where data auditability and absolute accuracy are non-negotiable. Data products in this environment serve as vital mechanisms for risk mitigation, fraud prevention, and regulatory compliance.
Image 3: Source vs Domain vs Consumption data products for finance. Generated by Gemini
Table 4: Comparing source vs domain vs consumption data products for finance
Best practices
Managing the lifecycle of a data product in Knowledge Catalog—from initial creation and discovery to contract enforcement and access provisioning—requires adherence to several best practices.
1. Curate for specific business use cases
Instead of bundling random data assets, treat a data product as a logical unit of distribution for a specific business outcome (e.g., “Marketing Campaign ROI” or “Customer 360”).
- Asset Selection: Include only well-correlated, high-quality data assets that directly contribute to the use case. For example, leverage the correlations surfaced by data insights, or cross-project table relationships, to automatically identify closely related tables.
- Avoid “Data Swamps”: Exclude temporary tables, staging data, or experimental data. A data product should be “production-ready” on the shelf.
- Limit number of assets: Keep each data product to fewer than 50 assets to minimize noise.
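The curation rules above can be turned into a simple pre-publication check. The sketch below is illustrative only: the asset names are hypothetical, and the naming pattern used to spot temporary, staging, or experimental tables is an assumed convention that you would replace with your own.

```python
import re

MAX_ASSETS = 50  # guideline from the list above: fewer than 50 assets per data product

# Assumed naming convention for non-production assets; adjust to your environment.
NON_PRODUCTION_PATTERN = re.compile(
    r"(^|[._])(tmp|temp|staging|stg|scratch|experimental)([._]|$)", re.IGNORECASE
)


def curation_issues(asset_names: list[str]) -> list[str]:
    """Return human-readable problems found in a candidate asset list."""
    issues = []
    if len(asset_names) >= MAX_ASSETS:
        issues.append(f"{len(asset_names)} assets; keep it under {MAX_ASSETS}")
    for name in asset_names:
        if NON_PRODUCTION_PATTERN.search(name):
            issues.append(f"{name} looks like a temporary, staging, or experimental asset")
    return issues


candidate_assets = [
    "marketing.campaign_spend",
    "marketing.attributed_revenue",
    "marketing.tmp_campaign_backfill",
]
for issue in curation_issues(candidate_assets):
    print(issue)
```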
2. Contextualization for agent grounding
One of the most powerful uses for data products is providing well-constructed context for grounding agents. AI agents struggle with “just” data; they need business logic to be accurate.
- Contextual Insights: Auto-generate rich-text documentation, asset descriptions, and sample insights using Knowledge Catalog’s data insights embedded within the data product. This allows AI agents to “read” the context and generate accurate, grounded responses.
- Use Aspects: Attach structured metadata (Aspects) to define business metadata and logic.
- Use Glossary: Ensure glossary terms are tagged to your raw data assets.
- Use contracts: Define a contract and certify the data product for human and agent consumption.
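As a rough illustration of why this context matters, the sketch below flattens a data product's description, glossary terms, aspects, and certification status into plain text that could be placed in an agent's instructions. The dictionary shape, the `grounding_context` helper, and the sample values are hypothetical; they do not represent how Knowledge Catalog stores or serves this metadata.

```python
def grounding_context(product: dict) -> str:
    """Flatten a data product's context into plain text an agent can read."""
    lines = [
        f"Data product: {product['name']}",
        f"Description: {product['description']}",
    ]
    for term, definition in product.get("glossary", {}).items():
        lines.append(f"Glossary - {term}: {definition}")
    for key, value in product.get("aspects", {}).items():
        lines.append(f"Aspect - {key}: {value}")
    if product.get("certified"):
        lines.append("Contract: certified for human and agent consumption")
    return "\n".join(lines)


# Illustrative values only.
marketing_roi = {
    "name": "Marketing Campaign ROI",
    "description": "Campaign spend joined with attributed revenue.",
    "glossary": {"ROI": "(attributed revenue - spend) / spend"},
    "aspects": {"refresh_cadence": "daily", "owner_team": "marketing"},
    "certified": True,
}
print(grounding_context(marketing_roi))
```

With the glossary definition in the prompt, an agent asked about ROI can apply the business formula instead of guessing one from column names.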
3. Streamline access with access groups
Data products let you manage individual asset-level permissions at the data product level using access groups.
- Access Groups: Create easy-to-understand access groups such as “Reader - Marketing”, “Reader - Other team”, or “Writer”. Map these access groups to the appropriate Google Groups or service accounts, typically aligned to teams, such as “marketing@acme.com”, “non-marketing@acme.com”, and “marketing-writer@acme.com”.
- Access approval: Leverage existing groups (e.g., marketing@acme.com) rather than individual accounts. Assigning access to a group automatically provisions permissions to all its members, eliminating individual approval bottlenecks.
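Conceptually, an access group fans out into one binding per underlying asset. The sketch below illustrates that fan-out in plain Python; it is not how Knowledge Catalog implements it. The `expand_bindings` helper and the asset names are hypothetical, and the choice of BigQuery roles for each access group is an assumption made for the example.

```python
def expand_bindings(access_groups: dict[str, dict], assets: list[str]) -> list[dict]:
    """Produce one binding per (asset, access group) pair."""
    bindings = []
    for asset in assets:
        for group_name, group in access_groups.items():
            bindings.append(
                {
                    "asset": asset,
                    "access_group": group_name,
                    "role": group["role"],
                    "members": sorted(group["members"]),
                }
            )
    return bindings


# Assumed mapping of access groups to roles and Google Groups.
access_groups = {
    "Reader - Marketing": {
        "role": "roles/bigquery.dataViewer",
        "members": ["group:marketing@acme.com"],
    },
    "Writer": {
        "role": "roles/bigquery.dataEditor",
        "members": ["group:marketing-writer@acme.com"],
    },
}
assets = ["marketing.campaign_spend", "marketing.attributed_revenue"]

bindings = expand_bindings(access_groups, assets)
for binding in bindings:
    print(binding["asset"], binding["role"], binding["members"])
```

Two access groups over two assets yield four bindings, which is the bookkeeping that access groups take off the hands of data owners.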
4. Consuming data products
Data products can be consumed by humans or agents:
- For human users: Discover data products in Knowledge Catalog search or in Knowledge Catalog’s business user interface. Request access, then keep track of the data products you can access by starring them in Knowledge Catalog search or by viewing them under “My requests” in Knowledge Catalog workflows. Use the assets packaged in the data product directly in the underlying service, such as BigQuery.
- For 3P agents: Leverage OneMCP to integrate Knowledge Catalog tools, including data products, with your agents. Create a set of instructions, and use data products to ground agents in context and correlations for accurate data insights.
Conclusion
The shift toward a decentralized architecture, operationalized through Google Cloud Knowledge Catalog, transforms raw enterprise information into high-velocity data products. This product-oriented shift is the essential prerequisite for agent grounding; it moves generative models away from the risks of “hallucination” and anchors them in a contextual, trusted, and governed truth.






