Humans of DataHub: Arun Vasudevan

We are excited to share our third installment of Humans of DataHub. This week we are joined by Arun Vasudevan of Peloton, where he is a Staff Engineer, Data Platform.
How did you first learn about DataHub?
“When looking for a Data Discovery solution in my old job at Expedia Group in 2019, I came across the announcement of DataHub from LinkedIn Engineering, loved the stream-driven approach, and immediately started hacking with it.”
What do you enjoy most about the DataHub Community?
“The pace at which the community and the Acryl Data Team push new features/innovations. Also, the Community is super friendly and open in helping others.”
What has DataHub enabled within your organization?
“We are still in the early stages of deployment within Peloton, but what I saw at Expedia Group was that it helped in discovering the data and MLModels across the organization for the first time in the company's history. It also evolved from an initial data discovery use case to governance and data quality use cases too.”
What are you most excited to see happen with DataHub in 2022?
“I am looking forward to more Data Governance related features such as the ability to request and approve permissions for the underlying datasets, the ability to identify and filter PII fields, etc.”
What’s your favorite DataHub feature/use case?
“In the past, I have always struggled with getting data ingested. With the contribution of Metadata Ingestion features, it has definitely become much easier to ingest data. This has been one of my favorite features in DataHub.”
Thank you, Arun, for speaking with the team and for all of your contributions to the DataHub Community.
If you are new to DataHub, just beginning to understand what “metadata” and “modern data stack” mean, or you’ve just read these words for the first time (welcome aboard! 🚀), let us take a moment to introduce ourselves and share a little history:
DataHub is an extensible metadata platform, enabling data discovery, data observability, and federated governance to tame the complexity of increasingly diverse data ecosystems. Originally built at LinkedIn, DataHub was open-sourced under the Apache 2.0 License in 2020. It now has a thriving community with over 2.3k members and 100+ code contributors, and many companies are actively using DataHub in production.
We believe that data-driven organizations need a reimagined developer-friendly data catalog to tackle the diversity and scale of the modern data stack. Our goal is to provide the most reliable and trusted enterprise data graph to empower data teams with best-in-class search and discovery and enable continuous data quality based on DataOps practices. This allows central data teams to scale their effectiveness and companies to maximize the value they derive from data.
Want to learn more about DataHub and how to join our community? Visit https://datahubproject.io and say hello on Slack. 👋