  • Centralize metadata access across all data assets
  • Achieve holistic data product reliability
  • Implement a user-centric data discovery approach
  • Establish collaborative data product development between analysts and engineers

From the initial events that capture user activity to the final Looker dashboards that make insights consumable for the business, DataHub provides detailed lineage and quality information critical for maintaining data reliability.

RONALD ANGEL

Data Products Manager, Miro


The Topline

Challenge
Lacked alignment on data quality across teams due to fragmented metadata and a non-intuitive discovery experience

Solution
Adopted DataHub Cloud as a centralized metadata management platform to improve data product reliability and enable intuitive, user-friendly discovery

Impact
Improved data reliability, increased transparency, and fostered shared ownership across technical and business stakeholders

Note: This story was originally published September 2024.

Challenge

Miro’s data engineering team needed to deliver reliable data products across a growing analytics ecosystem, but their initial setup fell short.

They had adopted Airflow as the central metadata hub for SLA validation, but that approach made it hard to align on data quality with data consumers across the organization.

To improve data reliability, Miro’s data engineering team had to address the following obstacles:

  • Data contracts were too technical. Contracts lived in engineering-owned repos and referenced internal task names that were unintelligible to analytics users. Business stakeholders couldn’t map technical references back to their domains or use cases.
  • Notifications lacked context. Airflow alerts focused on pipeline statuses and technical metrics, offering little guidance on business relevance or next steps.
  • Lineage was opaque. Understanding dependencies required deep technical help. There was no standardized way to show how inputs like user entities and attribution models connected to business metrics like revenue.
  • Uptime measurement was misleading. Uptime metrics were based primarily on Airflow task statuses, which provided freshness indicators but did not assess overall data quality, particularly for complex dependencies within pipelines.
  • Data quality monitoring was shallow. Reliability checks focused on pipeline task freshness, missing broader quality issues.
  • Tooling was disconnected. Airflow couldn’t see into downstream tools like Looker or the Events dashboard, resulting in incomplete visibility into data product health.
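
The gap between task-status uptime and actual data quality can be made concrete with a small sketch. The run records, field names, and both functions below are invented for illustration; they do not reflect Miro's pipelines or any Airflow API.

```python
# Hypothetical daily run records: task status plus the outcome of quality checks.
# All names and values here are assumptions made for illustration.
runs = [
    {"day": "mon", "task_succeeded": True, "checks_passed": True},
    {"day": "tue", "task_succeeded": True, "checks_passed": False},
    {"day": "wed", "task_succeeded": True, "checks_passed": True},
    {"day": "thu", "task_succeeded": False, "checks_passed": False},
]


def task_uptime(records):
    """Share of runs where the pipeline task finished successfully."""
    return sum(r["task_succeeded"] for r in records) / len(records)


def quality_aware_uptime(records):
    """Share of runs where the task succeeded and its quality checks passed."""
    healthy = sum(r["task_succeeded"] and r["checks_passed"] for r in records)
    return healthy / len(records)


print(task_uptime(runs))           # 0.75
print(quality_aware_uptime(runs))  # 0.5
```

In this toy data the task-based figure counts Tuesday as healthy even though its quality checks failed, which is the kind of overstatement the bullet on uptime describes.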

Solution

Recognizing these challenges, Miro enhanced their data stack with DataHub Cloud as the central metadata management platform.

The implementation aimed to create holistic, accessible metadata for data products, contracts, and expectations across the organization. By no longer relying solely on Airflow metadata, Miro enabled more flexible definitions for both foundational components and business metrics.

To promote collaboration, Miro moved data product and contract definitions closer to the analytics domain by placing them in their dbt repository. This empowered analysts already familiar with the repo to contribute to product creation and quality standards by authoring YAML files that conform to DataHub's data product and contract definitions.
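
As a rough illustration of what an analyst-authored definition could involve, here is a minimal Python sketch. The field names (`name`, `owner`, `assets`, `sla`), the example product, and the `validate_product` and `readable_sla` helpers are all hypothetical; they are not taken from Miro's repository or from DataHub's schema. The dictionary stands in for the parsed contents of a YAML file.

```python
# Hypothetical sketch: field names and helpers are assumptions, not DataHub's schema.
REQUIRED_FIELDS = ("name", "owner", "assets", "sla")

# Stand-in for a parsed YAML file an analyst might author in the dbt repository.
product = {
    "name": "revenue_reporting",
    "owner": "analytics-team",
    "assets": ["user_entities", "attribution_model", "revenue_dashboard"],
    "sla": {"fresh_by_utc": "08:00", "max_failed_checks": 0},
}


def validate_product(definition):
    """Return a list of readable problems; an empty list means the definition passes."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS
                if field not in definition]
    if "assets" in definition and not definition["assets"]:
        problems.append("a data product needs at least one asset")
    return problems


def readable_sla(definition):
    """Render the SLA as a sentence a business stakeholder can read."""
    sla = definition["sla"]
    return (f"{definition['name']} is refreshed by {sla['fresh_by_utc']} UTC "
            f"with at most {sla['max_failed_checks']} failed quality checks.")
```

Keeping validation messages and the SLA summary in plain language reflects the goal stated above: definitions that analysts can write and business users can read without knowing internal task names.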

As a result, data products are now fully discoverable in the UI, complete with contract details and readable SLAs, ensuring transparency, accountability, and alignment across teams.

Impact

With DataHub Cloud at the core of their holistic metadata management strategy, Miro implemented analytics data products and improved data product reliability for technical and non-technical users alike. 

Key outcomes included:

  • Established a holistic metadata management strategy for technical and business users
  • Accelerated use of data by enabling business users to access complete data product documentation instead of navigating through technical documentation
  • Enhanced stakeholder trust in data products and confidence in data-driven decisions through transparent, comprehensive visibility across the entire data ecosystem
  • Improved cross-team collaboration between analytics teams and data consumers
