GOALS
- Visualize data flow and dependencies across complex multi-system architecture
- Enable self-service metadata annotation by data producers
- Track upstream and downstream impacts of dataset changes
- Decentralize metadata management through CI/CD workflows
The Topline
Challenge
Sparse documentation, outdated metadata, and an incumbent tool that lacked lineage support, programmatic access, and dbt integration
Solution
Implemented DataHub with custom workflows enabling decentralized, self-service metadata management for data producers
Impact
Achieved organization-wide adoption: 300+ users manage 23,000+ datasets with full lineage visibility and decentralized metadata management
Note: This story was originally published May 2024.
Challenge
Funding Circle, a lending platform that has helped over 140,000 small businesses secure loans, faced significant barriers to achieving self-service data capabilities.
The organization’s data ecosystem was growing rapidly, processing large volumes of data from loan applications and various internal and external systems. Their existing metadata management tool, TrueDAT, presented multiple limitations that hindered scalability:
- Lack of lineage support, which prevented comprehensive data flow analysis
- No programmatic access, which led to centralized and unscalable metadata ingestion
- Limited compatibility with their diverse data sources and platforms
- No support for dbt, a key tool in their data transformation processes
These constraints made it increasingly difficult to maintain data visibility and governance across their complex ecosystem.
“It’s really important for us to visualize and understand how data has been flowing through different systems to be able to analyze any impact of the changes we make and also to understand upstream and downstream dependencies of a given dataset.”
— Harsha Mandadi, Senior Data Platform Engineer, Funding Circle
Solution
Funding Circle transitioned to DataHub to overcome these limitations and enable true self-service data capabilities. The implementation provided several key advantages that directly addressed their challenges:
- Column and table-level lineage for comprehensive data flow analysis
- Programmatic metadata ingestion for decentralized and scalable data management
- Extensive source support for their diverse data platform ecosystem
For sources that have looser standards around capturing metadata (Postgres, Tableau, and Athena), the team developed a workflow for their users to configure asset-level metadata via YAML, incorporate it into their CI/CD workflows, and emit validated metadata to DataHub with a few easy steps:
- YAML configuration: Users create/modify YAML files to capture dataset metadata, including ownership, descriptions, and glossary terms
- CI/CD integration: Configuration details are added to Drone CI/CD pipelines with specified plugin names and manifest files
- Automated validation: Build promotion validates YAML structure and content, ensuring only valid metadata reaches DataHub
- Metadata surfacing: Enriched metadata with owners, tags, and terms becomes visible in DataHub
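The validation step in this workflow can be illustrated with a minimal sketch. It is not Funding Circle's actual implementation: the field names (`dataset`, `platform`, `owners`, `description`, `glossary_terms`), the `validate_manifest` helper, and the rules it enforces are all hypothetical. The sketch assumes the YAML file has already been parsed into a Python dict, and it leaves out the step that emits metadata to DataHub.

```python
from dataclasses import dataclass, field

# Hypothetical manifest schema: these field names are illustrative
# assumptions, not the layout Funding Circle actually uses.
REQUIRED_FIELDS = ("dataset", "platform", "owners", "description")
# The three loosely governed sources named in the case study.
ALLOWED_PLATFORMS = {"postgres", "tableau", "athena"}


@dataclass
class ValidationResult:
    errors: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_manifest(manifest: dict) -> ValidationResult:
    """Check one parsed manifest entry before it would be emitted."""
    result = ValidationResult()

    # Every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not manifest.get(name):
            result.errors.append(f"missing required field: {name}")

    platform = manifest.get("platform")
    if platform and platform not in ALLOWED_PLATFORMS:
        result.errors.append(f"unsupported platform: {platform}")

    owners = manifest.get("owners")
    if owners and not (
        isinstance(owners, list) and all(isinstance(o, str) for o in owners)
    ):
        result.errors.append("owners must be a list of strings")

    if not isinstance(manifest.get("glossary_terms", []), list):
        result.errors.append("glossary_terms must be a list")

    return result


# Stand-ins for the dicts a YAML parser would produce from user files.
good = {
    "dataset": "loan_applications",
    "platform": "postgres",
    "owners": ["data-platform-team"],
    "description": "One row per submitted loan application.",
    "glossary_terms": ["Loan"],
}
bad = {"dataset": "loans", "platform": "excel", "owners": "alice"}

if __name__ == "__main__":
    for entry in (good, bad):
        outcome = validate_manifest(entry)
        status = "valid" if outcome.ok else "invalid"
        print(entry["dataset"], status, outcome.errors)
```

In a CI/CD pipeline, a check like this would typically fail the build when any manifest returns errors, so that only valid metadata is promoted.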
“DataHub provided us with column and table-level lineage support, multiple ways to programmatically ingest metadata, support for multiple data sources, and the ability to extend the metadata model to have custom platform information.”
— Harsha Mandadi, Senior Data Platform Engineer, Funding Circle
Impact
With DataHub, Funding Circle realized significant organizational and technical benefits.
Key outcomes included:
- Decentralized metadata management reducing bottlenecks and improving scalability
- Enhanced data discovery and understanding of data dependencies
- Complete lineage visibility at column and table levels
- Self-service metadata annotation enabling data producers to enrich assets directly in their workflows
- Extended platform support accommodating custom platforms alongside standard data sources
“DataHub has been very successful and it has been well adopted within our organization.”
HARSHA MANDADI
Senior Data Platform Engineer, Funding Circle