Metadata Day Round-Up: 4 Principles That Are Driving Modern Data Governance
In May this year, we at Acryl Data teamed up with LinkedIn once again to host the third edition of Metadata Day. A massive shoutout to our participants, expert panelists, and the 4000+ strong DataHub community for making this happen!
A quick flashback before I go ahead. In 2020, when we launched Metadata Day, we went all in to focus on the Importance of Metadata. 2021 was all about metadata and the data mesh.
In 2022, we zoomed in — and out — to talk about Metadata: Governance as Code, with a distinguished expert panel from the world of academia, research, and the industry.

I’ll be sharing my key takeaways from this insight- and idea-packed session in a series of posts, starting with this one.
But before we dive in, I want to take a minute to set some context and explore the theme of the event.
Why is Data Governance as Code important?
As the workshop kicked off, we realized that there may be a more fundamental question that needs to be answered first:
Why are we still talking about data governance?
Data Governance as a term has been around for pretty much as long as Data itself, but we as an industry still seem to grapple with defining it well, much less solving for it. It’s not for lack of trying; there are a couple of reasons that make it an inherently complicated problem.
- One, it is broad: it includes data quality, privacy, access, and lifecycle management.
- Two, it’s complicated: there are multiple personas, processes, and business use cases to cater to.
The flow of data into, within, and out of today’s organizations is a tsunami breaking through rigid data governance methods. Yet our programs still rely on that command and control approach.
Achieving deep insights from data can’t happen without good governance practices. All indicators point to the need to create a resilient and responsive data governance function.
This snippet from Laura Madsen’s Disrupting Data Governance: A Call to Action perfectly describes why data governance continues to be a talking point within the data community, and in businesses at large.
Is Data Governance as Code the solution?
The last few years have seen an increased interest in applying code-first principles to data governance and its implementation. While this is exciting, there’s a lot that needs to be done to make this a reality.
Metadata Day presented the perfect opportunity to do just that — by diving deeper into data governance to
- distill the high-level ideas of governance-as-code into ideas that can work in small and large companies
- discuss the practical implementation of these ideas, and
- most importantly, build and nurture a community of practitioners committed to data governance.
Now that we’re on the same page about the theme of this year’s Metadata Day, it’s time to dive into the key takeaways from the panel session.
While there’s a lot that I took away from my discussion with the expert panel, in this post, I want to focus on four fundamental approaches — in some cases ‘shifts’ — that emerged as the foundation of a robust approach to data governance.
The 4 Foundational Principles of Effective Data Governance
1. Data governance ( ̶s̶h̶o̶u̶l̶d̶ ̶b̶e̶) IS a business priority

Teresa, who heads the incubation and scaling of breakthrough cloud technologies at Accenture, talked about how critical data is becoming to business. She shared that:
- Even companies that aren’t traditionally data-focused are looking at AI and data more than ever.
- 50% of earnings calls of Fortune 500 companies have CEOs reporting on AI and data.
Companies need governance to make decisions on trusted business data, and the most effective approach to governance is one that focuses on business users and their needs.
In my conversations with hundreds of data engineers, data architects, and data leaders in the DataHub community, there’s something I’ve noticed time and again.
Data teams that are highly successful within their company are those that not only build and adopt technologies to improve their own lives, but also collaborate closely with Sales, Marketing, and other business stakeholders to build solutions that integrate governance into their critical business workflows.
2. A mindset shift: Towards usage and usability, not just protection
For too long, data governance has almost exclusively focused on protecting data. This needs to change. We need to look at data governance as a way to
- ensure usability, and
- drive usage
Or, as our panelist Juan, Principal Scientist at data.world, said (quoting Mark Kitson): data governance is like a car brake. Its role is not to slow you down, but to enable you to drive fast, safely.
This discussion reminded me of my time leading GDPR efforts at LinkedIn, and a Strata 2017 talk called “Taming the ever-evolving Compliance Beast” that emerged from it. My colleague and head of Global Privacy at LinkedIn at the time, Kalinda Raina, had this great quote that I used in my talk to illustrate the paradox that exists between data democracy and data privacy.

But as I and many other practitioners have experienced firsthand, you can solve for both if you build upon the right foundation of metadata.
This starts with first understanding the needs of all the stakeholders involved — not just the data producers, but also the business users and data consumers.
3. Data governance should be about making decision-making scalable and easier
We’ve underscored the importance of enabling usability for a successful data governance practice, but prioritizing usability depends on understanding what usability looks like in action.
Ultimately, good data governance, as Nishant, Director of Privacy Engineering, Architecture, and Analytics at Uber, says, is about building discipline around data — how and what to collect, where to keep it, and whom to use it for — so we can make better decisions at scale. What does this look like?
Bringing in automation wherever appropriate to make way for deterministic decision-making — eliminating the need for engineers to make decisions on a case-by-case basis.
An anecdote that Nishant shared, which paralleled what we had done at LinkedIn, was about setting defaults that removed engineers’ access to datasets they had not accessed for some time, while retaining their access to datasets they used frequently.
We paired this with an easy automated way to re-request access to data for a limited window of time with business justification. This is a simple but very effective way to use operational metadata (audit logs) to trim down the surface area of access automatically — without needing a “stop the world” initiative or long meetings with stakeholders to align on a “data access management” program.
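The mechanism described above could be sketched roughly as follows. This is only an illustration, not the system built at Uber or LinkedIn: the audit-log schema, the function names, the 90-day idle threshold, and the 7-day re-request window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessEvent:
    """One audit-log entry: who read which dataset, and when (hypothetical schema)."""
    user: str
    dataset: str
    accessed_at: datetime


@dataclass(frozen=True)
class TemporaryGrant:
    """A time-boxed grant issued after a re-request (hypothetical shape)."""
    user: str
    dataset: str
    justification: str
    expires_at: datetime


def stale_grants(grants, audit_log, now, idle_days=90):
    """Return the (user, dataset) grants with no recorded access in the last idle_days.

    The 90-day default is an assumption; the post only says "some time".
    """
    cutoff = now - timedelta(days=idle_days)

    # Find the most recent access per (user, dataset) pair.
    last_seen = {}
    for event in audit_log:
        key = (event.user, event.dataset)
        if key not in last_seen or event.accessed_at > last_seen[key]:
            last_seen[key] = event.accessed_at

    # A grant is stale if it was never used or was last used before the cutoff.
    return {g for g in grants if g not in last_seen or last_seen[g] < cutoff}


def request_access(user, dataset, justification, now, window_days=7):
    """Re-grant access for a limited window, but only with a business justification."""
    if not justification.strip():
        raise ValueError("a business justification is required")
    return TemporaryGrant(
        user=user,
        dataset=dataset,
        justification=justification.strip(),
        expires_at=now + timedelta(days=window_days),
    )
```

Because the decision depends only on the audit log and a threshold, it can run on a schedule with no case-by-case review; the re-request path keeps the default reversible.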
Nishant on the practical way to look at data governance and its objectives [10:30–11:32]
This requires baking in governance and identity signals (via metadata) into the data itself to give data the identity it needs — so it can be used effectively and correctly — without adding to the risk of the business.
4. A practical shift: Towards federated computational governance

It’s no secret that most companies, in their pursuit of process-light approaches, have so far rewarded distribution and loose coupling.
You can see this in the decentralized software engineering stack: versioned, source-controlled artifacts; APIs and tests that act as contracts between creators of libraries/services and their consumers; and CI/CD pipelines and operational monitoring that ensure artifacts continue to meet their commitments as they evolve. These are characteristics we take for granted as part of modern engineering culture.
In contrast, data governance has stayed centralized — partly due to it inherently being a centralized concern, and partly because of the lack of data tooling to bring modern engineering practices to data governance.
It is clear that data governance approaches need to shift, simply because what’s worked so far may not work in the future.
We need to move towards a federated approach to data governance — one that applies software engineering discipline towards the effective enforcement of centralized concerns (policies).
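As a rough illustration of what that could look like, the sketch below expresses one centrally owned policy as a plain function and evaluates it against metadata that domain teams maintain for their own datasets, much like a test suite in a CI pipeline. The metadata fields, the "pii" tag, and the rule itself are hypothetical and chosen only for the example; they are not taken from the panel discussion or from any particular tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DatasetMetadata:
    """Metadata a domain team declares for a dataset it owns (hypothetical shape)."""
    name: str
    owner: Optional[str] = None
    retention_days: Optional[int] = None
    column_tags: dict = field(default_factory=dict)  # column name -> set of tags


def check_pii_policy(dataset):
    """Central policy (illustrative): PII datasets need an owner and a retention period."""
    has_pii = any("pii" in tags for tags in dataset.column_tags.values())
    if not has_pii:
        return []

    violations = []
    if not dataset.owner:
        violations.append(f"{dataset.name}: PII dataset has no owner")
    if dataset.retention_days is None:
        violations.append(f"{dataset.name}: PII dataset has no retention period")
    return violations


def check_all(datasets):
    """Evaluate the policy across every declared dataset, e.g. as a CI step."""
    return [violation for d in datasets for violation in check_pii_policy(d)]
```

The policy is defined once, centrally, while the metadata it is checked against stays with the teams that produce the data. A CI job could fail a change whenever `check_all` returns a non-empty list.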
Watch This Space for More
I’ll be sharing more of our learnings and takeaways from my discussion with the expert panel.
Stay tuned for the rest of the posts in this series.
In the meantime, check out the earlier editions of Metadata Day:
- Metadata Day 2021: Metadata Meets the Data Mesh
- Metadata Day 2020: The Importance of Metadata