  • Visualize data flow and dependencies across a complex multi-system architecture
  • Enable self-service metadata annotation by data producers
  • Track upstream and downstream impacts of dataset changes
  • Decentralize metadata management through CI/CD workflows

The Topline

Challenge
Sparse documentation, outdated metadata, and an incumbent tool that lacked lineage support, programmatic access, and dbt integration

Solution
Implemented DataHub with custom workflows enabling decentralized, self-service metadata management for data producers

Impact
Achieved organization-wide adoption: 300+ users manage 23,000+ datasets with full lineage visibility and decentralized metadata management

Note: This story was originally published May 2024.

Challenge

Funding Circle, a lending platform that has helped over 140,000 small businesses secure loans, faced significant barriers to achieving self-service data capabilities.

The organization’s data ecosystem was growing rapidly, processing large volumes of data from loan applications and various internal and external systems. Their existing metadata management tool, TrueDAT, presented multiple limitations that hindered scalability:

  • No lineage support, which prevented comprehensive data flow analysis
  • No programmatic access, which kept metadata ingestion centralized and unscalable
  • Limited compatibility with their diverse data sources and platforms
  • No support for dbt, a key tool in their data transformation processes

These constraints made it increasingly difficult to maintain data visibility and governance across their complex ecosystem.

“It’s really important for us to visualize and understand how data has been flowing through different systems to be able to analyze any impact of the changes we make and also to understand upstream and downstream dependencies of a given dataset.”

 — Harsha Mandadi, Senior Data Platform Engineer, Funding Circle

Solution

Funding Circle transitioned to DataHub to overcome these limitations and enable true self-service data capabilities. The implementation provided several key advantages that directly addressed their challenges:

  • Column and table-level lineage for comprehensive data flow analysis
  • Programmatic metadata ingestion for decentralized and scalable data management
  • Extensive source support for their diverse data platform ecosystem
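
To make the lineage point concrete, the toy sketch below shows the kind of downstream impact question that table-level lineage answers: given one dataset, which assets could be affected by a change to it? The dataset names and the edge map are hypothetical, and the traversal is a plain breadth-first search rather than DataHub's own API.

```python
from collections import deque

# Hypothetical lineage edges: upstream dataset -> datasets that read from it.
DOWNSTREAM = {
    "postgres.loan_applications": ["dbt.stg_applications"],
    "dbt.stg_applications": ["dbt.fct_loans", "dbt.dim_borrowers"],
    "dbt.fct_loans": ["tableau.loan_dashboard"],
    "dbt.dim_borrowers": ["tableau.loan_dashboard"],
}


def downstream_of(dataset, edges):
    """Return every dataset reachable downstream of `dataset`, in breadth-first order."""
    seen = []
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            # Skip assets already recorded and guard against cycles back to the start.
            if child not in seen and child != dataset:
                seen.append(child)
                queue.append(child)
    return seen


impacted = downstream_of("postgres.loan_applications", DOWNSTREAM)
print(impacted)
```

Running the same walk over a reversed edge map would answer the upstream question instead.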

For sources with looser standards around capturing metadata (Postgres, Tableau, and Athena), the team developed a workflow that lets users configure asset-level metadata in YAML, incorporate it into their CI/CD workflows, and emit validated metadata to DataHub in four steps:

  1. YAML configuration: Users create or modify YAML files to capture dataset metadata, including ownership, descriptions, and glossary terms
  2. CI/CD integration: Configuration details are added to Drone CI/CD pipelines with specified plugin names and manifest files
  3. Automated validation: Build promotion validates YAML structure and content, ensuring only valid metadata reaches DataHub
  4. Metadata surfacing: Enriched metadata with owners, tags, and terms becomes visible in DataHub
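
The validation step can be sketched roughly as follows. This is a minimal illustration, not Funding Circle's actual implementation: the field names (`platform`, `name`, `owners`, `description`, `glossary_terms`), the validation rules, and the example asset are all assumptions. The input is a plain dictionary of the kind a YAML parser would return, and the final emission to DataHub is left out.

```python
from typing import Any

# Assumed schema for illustration only; the real YAML layout is not described in the story.
REQUIRED_FIELDS = ("platform", "name", "owners", "description")
ALLOWED_PLATFORMS = {"postgres", "tableau", "athena"}


def validate_asset(asset: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one asset entry; an empty list means valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not asset.get(field):
            errors.append(f"missing or empty field: {field}")
    platform = asset.get("platform")
    if platform and platform not in ALLOWED_PLATFORMS:
        errors.append(f"unsupported platform: {platform}")
    owners = asset.get("owners")
    if owners and not (isinstance(owners, list) and all(isinstance(o, str) for o in owners)):
        errors.append("owners must be a list of strings")
    return errors


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string in DataHub's urn:li:dataset format."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Hypothetical entry, shaped like the result of parsing one YAML document.
example = {
    "platform": "postgres",
    "name": "loans.public.applications",
    "description": "Loan applications submitted by small businesses",
    "owners": ["data-platform"],
    "glossary_terms": ["LoanApplication"],
}

problems = validate_asset(example)
if problems:
    # In a CI/CD pipeline, a non-empty list would fail the build.
    raise SystemExit("\n".join(problems))
print(dataset_urn(example["platform"], example["name"]))
```

Returning a list of problems rather than raising on the first one lets a pipeline report every issue in a file in a single run, so a data producer can fix them all before resubmitting.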

“DataHub provided us with column and table-level lineage support, multiple ways to programmatically ingest metadata, support for multiple data sources, and the ability to extend the metadata model to have custom platform information.”

 — Harsha Mandadi, Senior Data Platform Engineer, Funding Circle

Impact

With DataHub, Funding Circle realized significant organizational and technical benefits.

Key outcomes included:

  • Decentralized metadata management reducing bottlenecks and improving scalability
  • Enhanced data discovery and understanding of data dependencies
  • Complete lineage visibility at column and table levels
  • Self-service metadata annotation enabling data producers to enrich assets directly in their workflows
  • Extended platform support accommodating custom platforms alongside standard data sources

“DataHub has been very successful and it has been well adopted within our organization.”

HARSHA MANDADI

Senior Data Platform Engineer, Funding Circle
