Humans of DataHub: Mohamad Amzar

We are excited to share our next installment of Humans of DataHub, featuring Mohamad Amzar! Mohamad is an Analytic Engineer at CDX, Bank Islam. At CDX, the team re-evaluates business models to prepare for the future, with the aim of improving efficiency and transparency and providing better customer experiences.

Mohamad Amzar, also known as Amz on DataHub Slack.
How did you first learn about DataHub?
[DataHub] was my first task when I joined the company. We needed a data catalog solution, and our Data Team Lead, Hafidz, shared DataHub with me. The platform was already developed using Kubernetes but did not fully utilize the functionality. So, I explored the documentation (DataHub has complete docs that are easy to follow). I began by deploying DataHub locally and attempting my first Postgres metadata ingestion. I enjoyed using both ways of interacting with the DataHub GMS, the CLI and the GUI. It gave me many options for applying ingestion, transformations, and more.
What do you enjoy most about the DataHub Community?
Super-duper active members, especially from the DataHub Core team. They try not to let any question go unanswered. Even with repeated questions, Community Members try to explain things to the person asking and help solve the problem, or share a link to an existing thread to make sure the person gets the support they need. And we can see attendance at the Town Hall rising each month. So, I can see that every user enjoys and loves being a part of the DataHub Community.
What has DataHub enabled within your organization?
DataHub really helps us explain our data to newcomers. It won’t take much time for the person to discover the data. We have multiple data sources with various pipelines; hence it is hard to explain without having better visual documentation. Also, we [address] data governance and data quality [issues] by using DataHub.
What are you most excited to see happen with DataHub?
I am excited to see more interaction in the DataHub interface so that more things can be done there rather than in the CLI. Besides that, I want to see more data sources supported by all functions (e.g., Postgres for lineage).
What’s your favorite DataHub feature/use case?
Data Lineage. It is very interesting to understand the visualization of flows from data sources to utilization. This includes where the data is transformed, stored, and used during the process.
What is your favorite DataHub Slack channel and why?
My favorite channel is #introduce-yourself. It helps me get connected with more people, especially within the same field as mine.
What advice would you give to someone who is just joining the DataHub Community?
Don’t wait. Ask your question when you are stuck or need help!
If you are new to DataHub, just beginning to understand what “metadata” and “modern data stack” mean, or you’ve just read these words for the first time (welcome aboard! 🚀), let us take a moment to introduce ourselves and share a little history:
DataHub is an extensible metadata platform, enabling data discovery, data observability, and federated governance to tame the complexity of increasingly diverse data ecosystems. Originally built at LinkedIn, DataHub was open-sourced under the Apache 2.0 License in 2020. It now has a thriving community with over 4.7k members and 280+ code contributors, and many companies are actively using DataHub in production.
We believe that data-driven organizations need a reimagined developer-friendly data catalog to tackle the diversity and scale of the modern data stack. Our goal is to provide the most reliable and trusted enterprise data graph to empower data teams with best-in-class search and discovery and enable continuous data quality based on DataOps practices. This allows central data teams to scale their effectiveness and companies to maximize the value they derive from data.
Want to learn more about DataHub and how to join our community? Visit https://datahubproject.io and say hello on Slack. 👋