
DataHub 1.0 Is Here
We are incredibly excited to announce DataHub 1.0, marking five years of open source excellence in metadata management and ushering in the next generation of data catalog and AI management.
The Journey to DataHub 1.0
When we launched DataHub in 2020, we had a vision to revolutionize how organizations discover, manage, and understand their data. What started as a concept has grown into a vibrant, global community with 12,500+ Slack members, 10,000+ GitHub stars, and 6,300+ deployments.
This journey has been powered by 593 contributors, including teams from Visa, Netflix, Pinterest, Etsy, Foursquare, Grab, and many other leading organizations. The journey to 1.0 has been marked by continuous iteration, meaningful community feedback, and a relentless focus on solving real-world data management challenges.
Through each release, we’ve refined our understanding of what organizations need from a modern data catalog, leading to the comprehensive and powerful platform we’re proud to release today.
Three Pillars of Innovation in 1.0
1. Reimagined User Experience: Putting Usability First
The new DataHub experience represents a fundamental rethinking of how users interact with their metadata and datasets. Through extensive user research and community feedback, we identified that the biggest barriers to metadata adoption weren’t technical limitations but rather the friction in day-to-day use.
So, we’ve rebuilt the interface from the ground up. This redesign wasn’t just about making things look better — it was about making data discovery and understanding fundamentally more accessible to everyone in the organization, from data engineers to business analysts.
Intuitive Platform-Based Navigation
Hierarchically browse data by database and schema in Snowflake, BigQuery, Redshift, Databricks, and more. Combine hierarchical navigation with filtering by data owners, domain, tags, and glossary terms to find the right data fast.

Seamless Lineage Exploration
Our reimagined lineage view features multi-level expansion, name-based search, and column-level visibility, making it easier than ever to understand data relationships and impact.


Integrated Data Quality
Make confident decisions with deeply integrated quality signals throughout the platform, helping you quickly identify and trust reliable data assets.
2. Comprehensive AI Asset Support: Unifying Data and AI
As organizations rapidly scale their AI initiatives, the line between traditional data assets and AI assets continues to blur. We recognized that treating these as separate concerns was creating artificial barriers and increasing complexity for teams.
DataHub 1.0 breaks down these silos by treating AI assets as first-class citizens within the data ecosystem. A unified approach means teams can now track their entire data-to-AI pipeline in one place. The result is deep visibility into how data flows through AI systems, enabling better governance, faster debugging, and more confident deployment of AI models.
Unified Search and Discovery
Seamlessly search across models, model groups, and traditional data assets in one unified interface.

Advanced Versioning System
Track multiple versions of datasets and ML models with detailed performance metrics and clear linkages between versions.

Rich Model Statistics
Monitor key metrics across versions, understand performance trends, and make data-driven decisions about model deployment.

End-to-End Lineage
Trace data flows from raw inputs through models to final outputs, with complete versioning support.

3. DataHub Iceberg REST Catalog Beta: Simplifying Data Lake Management
Managing modern data lakes often means juggling multiple catalogs, tools, and interfaces, creating unnecessary complexity and potential security gaps. The beta release of our Iceberg REST Catalog tackles this challenge head-on by integrating Apache Iceberg table management directly into DataHub. This integration represents a fundamental shift in how organizations can manage their data lake metadata.
By bringing Iceberg table management into the same platform where teams already handle their metadata, we’re enabling more consistent governance, better visibility, and simplified operations for data lake management.
- Real-Time Insights: Get immediate visibility into Iceberg tables alongside other data assets
- Unified Access Management: Control access policies through business metadata like domains, glossary terms, and tags
- Simplified Operations: Reduce complexity by eliminating the need for a separate catalog
Current features include full Iceberg v1.6 support, read/write capabilities, and CRUD operations on namespaces, with performance matching the native Iceberg suite.
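To make the namespace CRUD operations mentioned above more concrete, here is a minimal sketch of the HTTP calls an Iceberg REST client would typically make, following the route layout of the Apache Iceberg REST Catalog specification. It only builds request descriptions and sends nothing. The base URL, the `demo_warehouse` prefix, and the helper names `namespace_path` and `namespace_requests` are illustrative assumptions, not values confirmed for any particular DataHub deployment. In practice an Iceberg client library would issue these calls for you.

```python
from urllib.parse import quote, urljoin

# Assumption: base URL of a DataHub-hosted Iceberg REST endpoint; adjust to your deployment.
CATALOG_URI = "http://localhost:8080/iceberg/"
# Assumption: the warehouse name is used as the REST path prefix.
PREFIX = "demo_warehouse"


def namespace_path(levels=None):
    """Return the relative REST path for the namespace collection or one namespace."""
    base = f"v1/{quote(PREFIX, safe='')}/namespaces"
    if not levels:
        return base
    # The REST spec joins multi-level namespaces with the 0x1F unit separator.
    encoded = quote("\x1f".join(levels), safe="")
    return f"{base}/{encoded}"


def namespace_requests(levels, properties):
    """Describe create, read, update, and delete calls as (method, url, json_body) tuples."""
    collection = urljoin(CATALOG_URI, namespace_path())
    item = urljoin(CATALOG_URI, namespace_path(levels))
    return [
        # Create the namespace with its initial properties.
        ("POST", collection, {"namespace": list(levels), "properties": properties}),
        # Read the namespace metadata back.
        ("GET", item, None),
        # Update properties; "removals" lists keys to drop.
        ("POST", item + "/properties", {"removals": [], "updates": properties}),
        # Delete the namespace.
        ("DELETE", item, None),
    ]


if __name__ == "__main__":
    for method, url, body in namespace_requests(["analytics", "daily"], {"owner": "data-eng"}):
        print(method, url, body)
```

Keeping the request construction separate from any HTTP client makes it easy to inspect the calls before pointing them at a real catalog, where authentication headers would also be required.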

Community Support and Getting Started
The launch of DataHub 1.0 marks not only a technical milestone but also a deeper commitment to community success. We understand that adopting or upgrading to a new version requires careful planning and support.
That’s why we’ve significantly expanded our community support infrastructure so that every organization can adopt DataHub 1.0 successfully. The expanded support includes:
- Weekly office hours extended to 60 minutes
- Comprehensive migration guides and documentation
- Dedicated issue templates for DataHub v1.0-rc
- Interactive demo available at demo.datahubproject.io
Looking Ahead: The Future of DataHub
While DataHub 1.0 represents a major milestone, we see it as just the beginning of a new chapter in our journey. The modern data stack continues to shift, with new technologies and paradigms emerging at breakneck speed.
Our vision for DataHub extends far beyond traditional metadata management — we’re building toward a future where DataHub serves as the central nervous system for data and AI operations. This means not just keeping pace with new technologies, but anticipating the needs of tomorrow’s data ecosystems and building the foundations for them today.
How to Get Involved
Whether you’re a long-time contributor or just discovering DataHub, there’s never been a better time to get involved. The DataHub 1.0 release candidate is available now, with the final release scheduled for February 28, 2025.
Join us in this celebration and be part of shaping the future of data discovery and AI asset management. Try DataHub 1.0 today and experience the next generation of metadata management.
New to DataHub? Get Started Here: