GOALS
- Create extensible cross-source metadata catalog
- Support custom entity types and ownership models
- Reduce connector maintenance burden on platform team
- Define custom metadata properties for governance compliance
The Topline
Challenge
Needed to evolve an internal data catalog that placed too much burden on the central platform team as data needs expanded
Solution
Partnered with DataHub for its extensibility, adding custom entities, ownership models, and properties
Impact
Enabled self-serve metadata, stronger governance, and reduced reliance on central team
Note: This story was originally published January 2024.
Challenge
Netflix’s internal cataloging tool, Metacat, helped federate metadata within its Big Data Warehouse layer, but it was limited in scope. As Netflix’s data needs evolved, so did the requirements for a more comprehensive, cross-layer catalog spanning online stores, real-time pipelines, and analytics systems.
Two core issues emerged:
- The central Data Platform Team bore the burden of maintaining connectors, instead of the data-owning teams
- There was no policy engine in place to enforce governance policies centrally
“There was a need to evolve the product to become a self-serve platform to enable the relevant source system teams to define the asset or entity types, and start ingesting the data into the catalog.”
— Ajoy Majumdar, Senior Staff Engineer, Netflix
Netflix required a solution that would support custom entity types, complex ownership structures, and privacy-driven custom properties aligned with regulatory requirements.
“DataHub gave us the extensibility features we needed to define new entity types easily and augment existing ones. During our evaluation, we assessed both functional and nonfunctional aspects, and DataHub performed exceptionally well in managing our traffic load and data volume.”
Ajoy Majumdar, Senior Staff Engineer, Netflix
Solution
Netflix selected DataHub after evaluating multiple metadata platforms. Its goal: to find a solution that functions not just as a data catalog, but as an extensible data platform.
DataHub’s extensibility was at the core of its appeal, supported by its robust scalability and feature set, developer experience, and community support.
Partnering with DataHub, Netflix set out to address three foundational data catalog needs:
- Scope for new entity types: Netflix’s data ecosystem called for new entity types specific to Netflix, for instance a custom asset type to accommodate GraphQL schemas
- Custom ownership model: The evolving ownership models within Netflix’s datasets required a custom ownership framework that enables finer granularity and better insight into data ownership
- Custom properties: To align with privacy and legal standards, Netflix needed to define custom properties within the catalog. These properties, defined by Netflix’s privacy and legal teams from glossaries relevant to their regulatory obligations, serve as guidelines for the terms under which data should be ingested into Netflix’s systems
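The three needs above can be pictured with a small, self-contained sketch. It is an illustration only, not DataHub's API or Netflix's implementation: the `Catalog`, `Asset`, and `PropertyDef` classes, and the example names `graphql_schema`, `schema_steward`, `data_sensitivity`, and `studio-api`, are all hypothetical. The idea is that source-system teams register their own entity types, ownership types, and governed properties, and the catalog rejects anything that has not been registered.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertyDef:
    """A governed custom property (hypothetical), e.g. defined by a privacy team."""
    name: str
    allowed_values: frozenset[str]


@dataclass
class Asset:
    """A catalog entry: an entity type, a name, owners, and custom properties."""
    entity_type: str
    name: str
    owners: dict[str, str] = field(default_factory=dict)      # owner -> ownership type
    properties: dict[str, str] = field(default_factory=dict)  # property -> value


class Catalog:
    """Toy in-memory catalog with registrable types (not a real DataHub client)."""

    def __init__(self) -> None:
        self.entity_types: set[str] = set()
        self.ownership_types: set[str] = set()
        self.property_defs: dict[str, PropertyDef] = {}
        self.assets: dict[tuple[str, str], Asset] = {}

    def register_entity_type(self, name: str) -> None:
        self.entity_types.add(name)

    def register_ownership_type(self, name: str) -> None:
        self.ownership_types.add(name)

    def define_property(self, name: str, allowed_values: set[str]) -> None:
        self.property_defs[name] = PropertyDef(name, frozenset(allowed_values))

    def ingest(self, asset: Asset) -> None:
        """Store the asset only if its type, owners, and properties are all registered."""
        if asset.entity_type not in self.entity_types:
            raise ValueError(f"unknown entity type: {asset.entity_type}")
        for owner, ownership_type in asset.owners.items():
            if ownership_type not in self.ownership_types:
                raise ValueError(f"unknown ownership type for {owner}: {ownership_type}")
        for prop, value in asset.properties.items():
            definition = self.property_defs.get(prop)
            if definition is None:
                raise ValueError(f"undefined property: {prop}")
            if value not in definition.allowed_values:
                raise ValueError(f"value {value!r} not allowed for property {prop}")
        self.assets[(asset.entity_type, asset.name)] = asset


# A source-system team onboards its own metadata (all names are hypothetical).
catalog = Catalog()
catalog.register_entity_type("graphql_schema")
catalog.register_ownership_type("schema_steward")
catalog.define_property("data_sensitivity", {"public", "restricted"})
catalog.ingest(
    Asset(
        entity_type="graphql_schema",
        name="studio-api",
        owners={"team-studio": "schema_steward"},
        properties={"data_sensitivity": "restricted"},
    )
)
```

Validating at ingestion time is one possible design choice: it keeps type and property definitions in the hands of the teams that own them while still giving the catalog a single place to reject metadata that falls outside the agreed vocabulary.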
Impact
Netflix leverages DataHub’s extensibility to support its bespoke data ecosystem and governance needs.
Key outcomes included:
- Improved productivity of the central data team by enabling self-serve data cataloging across teams and allowing source system owners to define and onboard metadata directly
- Met the unique needs of Netflix through support for custom entity types, ownership types, and properties
- Strengthened governance by defining custom metadata properties aligned with privacy and legal standards
- Offloaded connector development, reducing reliance on the central Data Platform Team