Hurb Arrives at Their Destination: A Single Source of Truth Across a Growing Data Stack

“We want to use DataHub as a source of truth.”
Customer: Hurb
Industry: Technology
Size: 1,000+ employees
Solution: DataHub Core (OSS)
Use case: Discovery, Lineage, Governance, Quality
Data stack: Metabase, BigQuery, Airflow, Kubernetes, Anomalo, Docker
Goals:
  • Unify data discovery across multiple platforms
  • Eliminate metadata inconsistencies
  • Improve data visibility with lineage tracking from source to destination
  • Streamline impact analysis
The Topline
  • Challenge: Struggled with data asset discovery, traceability, and metadata inconsistencies across their growing data platform
  • Solution: Implemented DataHub with custom Kubernetes deployment, Airflow orchestration, and integrations across their entire data stack
  • Impact: Established a single source of truth with end-to-end visibility and automated lineage tracking, improving discovery, quality, and governance

Note: This story was originally published February 2023.

Challenge

Hurb, a Brazilian online travel platform, built its business on a solid data-driven culture. 

However, rapid growth created significant data discovery and governance challenges:

  1. Fast-growing data assets: Data became difficult to manage as new technologies and integrations were added
  2. Resource cataloging and data discovery issues: Teams struggled to locate and understand the purpose of existing data assets across their growing infrastructure
  3. Traceability of data origin: Understanding how data was transformed and loaded was critical for strategic decision-making
  4. Building a single source of truth: Cataloging assets separately in their primary services (Metabase and BigQuery) caused metadata inconsistencies

“What’s the value of having many data assets if you cannot find them, or discover their purpose?”

Patrick Braz, Data Engineer, Hurb

Solution

Hurb chose DataHub after creating comprehensive project requirements documentation and evaluating data catalog tools. Their decision was driven by four key factors:

  1. User-friendly interface supporting their self-service culture
  2. Active and receptive community for implementation support
  3. Contribution opportunity aligning with their open source culture
  4. Built-in ingestion sources for their primary services (Metabase and BigQuery)

Hurb deployed DataHub on Kubernetes using custom flattened charts and made the strategic decision to disable frontend ingestion, positioning DataHub as their single source of truth with all ingestion controlled through backend processes.

The implementation centers on Airflow orchestration: all DataHub ingestion runs through Airflow's KubernetesPodOperator, with DAGs generated by a custom DAG factory. Hurb also built a custom Airflow integration based on dataset objects, which lets data engineers enrich metadata during DAG development and builds lineage automatically.
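
The factory pattern described above can be illustrated with a minimal sketch. This is not Hurb's code: the `IngestionJob` type, the helper names, the GMS URL, the container image default, and the source config values are all hypothetical placeholders. To stay runnable without Airflow installed, the sketch only builds the keyword arguments (`task_id`, `name`, `image`, `cmds`, `arguments`) that one would pass to a KubernetesPodOperator inside a DAG; it does not create DAGs itself.

```python
import json
import shlex
from dataclasses import dataclass, field


@dataclass
class IngestionJob:
    """One DataHub ingestion run, described declaratively (hypothetical type)."""
    name: str
    source_type: str  # a DataHub source type, e.g. "bigquery" or "metabase"
    source_config: dict = field(default_factory=dict)
    schedule: str = "@daily"


def build_recipe(job, gms_url):
    """Build an ingestion recipe: one source, pushed to a DataHub REST sink."""
    return {
        "source": {"type": job.source_type, "config": job.source_config},
        "sink": {"type": "datahub-rest", "config": {"server": gms_url}},
    }


def build_pod_spec(job, gms_url, image="acryldata/datahub-ingestion"):
    """Return kwargs for a pod task that writes the recipe and runs the CLI.

    The image default is an assumption; pin whatever image your cluster uses.
    """
    # JSON is also valid YAML, so the stdlib json module is enough here.
    recipe = json.dumps(build_recipe(job, gms_url))
    script = (
        f"printf '%s' {shlex.quote(recipe)} > /tmp/recipe.yml"
        " && datahub ingest -c /tmp/recipe.yml"
    )
    return {
        "task_id": f"ingest_{job.name}",
        "name": f"datahub-ingest-{job.name}".replace("_", "-"),
        "image": image,
        "cmds": ["sh", "-c"],
        "arguments": [script],
    }


def build_dag_specs(jobs, gms_url):
    """Factory step: one DAG description per ingestion job, keyed by DAG id."""
    return {
        f"datahub_ingest_{job.name}": {
            "schedule": job.schedule,
            "pod_task": build_pod_spec(job, gms_url),
        }
        for job in jobs
    }


# Placeholder jobs for the two primary services named in the story.
JOBS = [
    IngestionJob("bigquery", "bigquery", {"project_ids": ["demo-project"]}),
    IngestionJob("metabase", "metabase", {"connect_uri": "http://metabase:3000"}),
]
DAG_SPECS = build_dag_specs(JOBS, gms_url="http://datahub-gms:8080")
```

Keeping each ingestion job as data means adding a new source is a one-line change to the job list, which fits the decision to control all ingestion from the backend rather than from the UI.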

They integrated their data quality platform, Anomalo, and actively use DataHub’s impact analysis feature to identify who is affected by data changes or quality issues.
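
To make the idea of lineage-driven impact analysis concrete, here is a small self-contained sketch. It assumes each task declares its input and output datasets, derives lineage edges from those declarations, and walks the graph downstream to find the owners who would be affected by a change. The `Dataset` and `Task` classes, the table names, and the owner names are hypothetical; DataHub provides impact analysis itself, so this only illustrates the underlying traversal.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A dataset reference that renders as a DataHub-style dataset URN."""
    platform: str
    name: str
    env: str = "PROD"

    @property
    def urn(self):
        return (
            f"urn:li:dataset:(urn:li:dataPlatform:{self.platform},"
            f"{self.name},{self.env})"
        )


@dataclass
class Task:
    """A pipeline step with declared inputs (inlets) and outputs (outlets)."""
    task_id: str
    inlets: list
    outlets: list


def build_downstream_index(tasks):
    """Map each input dataset URN to the URNs produced directly from it."""
    downstream = {}
    for task in tasks:
        for src in task.inlets:
            for dst in task.outlets:
                downstream.setdefault(src.urn, set()).add(dst.urn)
    return downstream


def impacted(start_urn, downstream):
    """Breadth-first walk returning every dataset downstream of start_urn."""
    seen = set()
    queue = deque([start_urn])
    while queue:
        current = queue.popleft()
        for nxt in downstream.get(current, ()):
            if nxt not in seen and nxt != start_urn:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def affected_owners(start_urn, downstream, owners):
    """Collect the owners of all downstream datasets, sorted and de-duplicated."""
    hit = impacted(start_urn, downstream)
    return sorted({owner for urn in hit for owner in owners.get(urn, ())})


# Hypothetical example pipeline: raw bookings feed two derived tables.
raw = Dataset("bigquery", "demo-project.raw.bookings")
daily = Dataset("bigquery", "demo-project.analytics.bookings_daily")
revenue = Dataset("bigquery", "demo-project.finance.revenue")
kpis = Dataset("bigquery", "demo-project.analytics.kpis")

tasks = [
    Task("aggregate_bookings", inlets=[raw], outlets=[daily]),
    Task("build_marts", inlets=[daily], outlets=[revenue, kpis]),
]
owners = {
    daily.urn: ["analytics-team"],
    revenue.urn: ["finance-team"],
    kpis.urn: ["analytics-team"],
}

index = build_downstream_index(tasks)
print(affected_owners(raw.urn, index, owners))
```

In this example, a quality issue on the raw bookings table reaches all three derived tables, so both hypothetical owning teams would need to be notified.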

Impact

With DataHub, Hurb evolved from fragmented data management across multiple disconnected services to a unified metadata management platform that delivers end-to-end visibility.

Key outcomes include:

  • Established a single source of truth across a growing data stack
  • Eliminated metadata inconsistencies through centralized cataloging
  • Gained end-to-end data visibility and quality control from source to destination
  • Automated lineage building through a lineage backend
  • Enabled proactive impact analysis to identify who is affected by data changes or quality issues