DataHub Town Hall Recap: August 2025 Highlights
At our latest DataHub Town Hall, we looked at how DataHub is being applied in practice—and where the platform is headed next.
From Demandbase’s story of scaling data discovery and governance with the DataHub Iceberg REST Catalog, to a community member’s hands-on experiments with the DataHub MCP Server, to new ways lineage is helping teams anticipate change, the session highlighted the growing role of DataHub as the foundation for trustworthy, AI-ready data systems.
In this recap, you’ll get a closer look at the customer stories, community highlights, product demos, and roadmap updates that shaped the session.
Inside Demandbase’s Migration to DataHub Iceberg REST Catalog
As data environments grow more complex, discovery and governance often suffer. At Demandbase, Senior Manager of Data Systems Ryan Nowacoski shared how his team is using DataHub to bring order to that complexity.
By building their Iceberg REST Catalog on DataHub, Demandbase created a consistent view of assets, ownership, and lineage. The result:
- Easier discovery across distributed systems
- Stronger collaboration between data producers and consumers
- A governance model that scales with the business
Watch Ryan highlight the benefits Demandbase realized by merging their business catalog and Iceberg REST technical catalog into one operational catalog in DataHub.
“It’s been so nice utilizing DataHub in this way because now we are in a place where when we make any change to any of our tables in our unified platform it is immediately available in DataHub.”
— Ryan Nowacoski, Senior Manager of Data Systems, Demandbase
DataHub Community Spotlight: Exploring the DataHub MCP Server
DataHub community champion Mike Burke gave us a look at his experiments with the DataHub Model Context Protocol (MCP) Server.
Mike first learned about the DataHub MCP Server during our June Town Hall. With no prior hands-on experience, he decided to dive in—relying on documentation, YouTube overviews, and conversations with peers in the DataHub Community to guide his early experiments.
Before long, Mike had the MCP Server running locally on his laptop, testing the end-to-end experience. Along the way, he explored how authentication tokens are handled, how access controls are respected, and what safeguards are needed before moving toward production.
The real promise, Mike explained, is how the DataHub MCP Server transforms metadata exploration from point-and-click to conversational. Instead of navigating lineage graphs, you can ask questions and get direct answers—making metadata more approachable for engineers and non-engineers alike.
Hear more from Mike about his experience with the DataHub MCP Server.
“When you load up DataHub with a lot of data … there’s lots to look through. And [with the DataHub MCP Server] instead of clicking and pointing, you can just have a conversation.”
— Mike Burke, DataHub Community Champion
Making sense of modern data lineage
As data becomes cheaper to store and faster to generate, teams face a new kind of complexity: growing webs of dependencies across multiple platforms. A single schema change can ripple through dashboards, pipelines, and workflows.
Founding Product Manager Maggie Hays showed how DataHub’s lineage capabilities give teams the complete picture:
- Broad and deep coverage: DataHub automatically extracts lineage from over half of its 70+ supported ingestion sources. Its proprietary parser achieves 99.5% accuracy, capturing detailed, column-level lineage and tracing even the most complex dependencies that other open source tools often miss.
- Proactive impact analysis: With complete lineage, you can see exactly which downstream assets (down to the column) will be affected by a change. This makes it possible to notify stakeholders early and prevent widespread issues.
- “Shift left” with automated propagation: In DataHub Cloud, metadata added upstream (like documentation or PII tags) automatically propagates down the lineage graph. This saves time, ensures consistency, and is especially valuable for compliance efforts like GDPR.
- AI-powered discovery: With the DataHub Slack AI Discovery Assistant (now in Public Beta in DataHub Cloud v0.3.13), you can ask natural language questions like “What breaks if I update the LTV logic?” and get lineage-driven answers instantly.
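To make the impact analysis and propagation ideas above concrete, here is a minimal sketch in plain Python. It does not use the DataHub SDK or reflect how DataHub implements these features; the column names, the `LINEAGE` mapping, and the `downstream` and `propagate_tag` helpers are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed as its value.
LINEAGE = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": [
        "marts.customers.contact_email",
        "marts.campaigns.recipient",
    ],
    "marts.customers.contact_email": ["dashboards.support.customer_lookup"],
}


def downstream(start, lineage):
    """Return every column reachable downstream of `start`, in breadth-first order."""
    found = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Columns with no recorded children are treated as leaves.
        for child in lineage.get(node, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


def propagate_tag(start, tag, lineage, tags):
    """Return a copy of `tags` with `tag` added to `start` and all of its downstream columns."""
    result = {column: set(existing) for column, existing in tags.items()}
    for column in [start] + downstream(start, lineage):
        result.setdefault(column, set()).add(tag)
    return result


# Impact analysis: what is affected if staging.users.email changes?
impacted = downstream("staging.users.email", LINEAGE)

# Propagation: tag the source column once and let the tag flow downstream.
tags = propagate_tag("raw.users.email", "PII", LINEAGE, {})
```

In this toy graph, `impacted` lists the two mart columns and the support dashboard, and `tags` marks all five columns as PII. A real lineage graph would also need to handle transformations that remove sensitivity (for example, hashing), which this sketch ignores.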
At its core, lineage in DataHub gives practitioners the confidence to anticipate ripple effects, resolve issues early, and build trust in data at scale.
Watch Maggie demonstrate how the DataHub lineage graph supports a data quality investigation—and how it paves the way for proactive impact analysis.
Maggie also demonstrated how a data quality investigation can unfold through a natural language conversation with the DataHub Slack AI Discovery Assistant—now available to DataHub Cloud customers—or via the DataHub MCP Server wherever your agents run. Check it out.
What’s next for DataHub: 2025 roadmap update
We also shared what’s in the works across DataHub’s four core pillars:
1. Discovery: Building context-aware data exploration
In 2025, our discovery initiatives center on three priorities: capturing human context to enrich data insights, enabling intelligent exploration, and expanding the scope and depth of our end-to-end lineage graph.
Key initiatives include:
- New data sources: Our Hex connector is already live, with connectors for RudderStack, Snowplow, and Azure Data Lake coming soon.
- Hierarchical lineage: View the lineage graph at different levels of hierarchy—data job, container, domain, data product, and platform.
- Metrics catalog: Register, associate, document, and discover key metrics in DataHub.
- DataHub MCP Server (shipped!): The fully managed DataHub MCP Server is available to DataHub Cloud customers as of v0.3.12. The self-hosted DataHub MCP Server is available now to DataHub Core users.
- Slack AI Discovery Assistant (Public Beta): Lets DataHub Cloud customers discover and explore data in natural language, right where they work. Read about it in our DataHub Cloud v0.3.13 release blog.
- Customizable home pages (Private Beta): Lets DataHub Cloud teams curate the DataHub experience, guiding users to essential assets while individuals focus their workspaces on what they use daily. Check it out in our DataHub Cloud v0.3.13 release blog.
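As a rough illustration of the hierarchical lineage idea in the list above, the sketch below collapses dataset-level edges into domain-level edges. It is plain Python under stated assumptions, not DataHub's implementation: the `EDGES` list, the `DOMAIN` mapping, and the `roll_up` helper are hypothetical.

```python
# Hypothetical dataset-level lineage edges as (upstream, downstream) pairs.
EDGES = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "marts.revenue"),
    ("raw.users", "staging.users"),
    ("staging.users", "marts.revenue"),
]

# Hypothetical mapping from each dataset to its parent domain.
DOMAIN = {
    "raw.orders": "Sales",
    "staging.orders": "Sales",
    "raw.users": "Identity",
    "staging.users": "Identity",
    "marts.revenue": "Finance",
}


def roll_up(edges, parent):
    """Collapse fine-grained edges into edges between parent groups.

    Edges that start and end in the same group are dropped, since they
    are internal detail at the higher level of the hierarchy.
    """
    rolled = set()
    for upstream, downstream in edges:
        source, target = parent[upstream], parent[downstream]
        if source != target:
            rolled.add((source, target))
    return sorted(rolled)


domain_edges = roll_up(EDGES, DOMAIN)
```

Here the four dataset edges reduce to two domain edges, Identity to Finance and Sales to Finance. Swapping `DOMAIN` for a dataset-to-container or dataset-to-platform mapping would give the same view at a different level.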
Watch Maggie break down our key focuses for data discovery in 2025.
2. Governance: Building a central compliance command center
In 2025, our goal is to make DataHub a universal data registry for centralized compliance and policy enforcement.
Key initiatives include:
- Bidirectional syncing: Use DataHub Actions to propagate tags and glossary terms back to source data platforms.
- Logical datasets: Manage metadata for multiple physical assets in one place.
- Access request workflows (Private Beta): In DataHub Cloud, reduce data access wait times from days to minutes with configurable approval flows and seamless integrations. Learn more in the DataHub Cloud v0.3.13 release blog.
Watch Maggie explain our key focuses for data governance in 2025.
3. Observability: Building contextual quality insights for every user
In 2025, our focus is on making data observability more accessible, collaborative, and contextual.
Recent updates include:
- Python SDK improvements (Public Beta): Enhanced support for creating and managing assertions programmatically, making it easier for teams working with a high volume of checks to keep pace as their ecosystems evolve.
- Bulk column smart assertions (Public Beta): DataHub Cloud now supports AI-powered checks across multiple columns with just one click, achieving data quality at scale.
Check out both updates in the DataHub Cloud v0.3.13 release blog.
Watch Maggie share the latest updates on our observability roadmap.
4. Metadata graph: Building automation-ready infrastructure with robust monitoring
The metadata graph is the foundation that powers discovery, governance, and observability. In 2025, our focus is on making this core platform more robust, reliable, and automation-ready.
Key initiatives include:
- Quickstart improvements (shipped!): Improved stability and performance, with reduced build time and system resource usage.
- Service accounts (coming later this year): Create and manage service users for programmatic workflows and custom automations.
- Validation of ingestion run outcomes (shipped!): Diagnose metadata coverage gaps with improved visibility into configurations and outcomes.
Watch Maggie walk through the key focuses for our platform in 2025.
Your turn to shape what’s next
This Town Hall underscored what makes DataHub unique: it’s not just a platform, but a shared effort to make data ecosystems more reliable and AI more trustworthy.
Teams are already pushing the boundaries of what’s possible with DataHub: building internal apps, crafting new UIs, and deploying in production. Your story could be next—and we can’t wait to hear it.
There’s more where that came from
Watch the full August Town Hall
Catch the full session and demos on demand.
Explore the DataHub MCP Server
Learn how it works and discover practical use cases your team can put into practice in our deep-dive article.
Join our open source community
Join the 13,000+ members of the DataHub Community, a space for open source contributors, data practitioners, and anyone looking to shape the future of metadata-powered AI.