Humans of DataHub:
Roko Gudic
This week, we are joined by Roko Gudic, Data Science Manager at Infobip. Infobip is a global leader in omnichannel communication, and it’s their business to simplify how brands connect with, engage, and delight their customers at a global scale.
Roko shares his journey to DataHub, what DataHub has enabled within Infobip, his favorite features, and more. You don’t want to miss this conversation!
Humans of DataHub Interview with Roko
Conversation Transcript & Highlights
Edited for brevity & clarity
Maggie Hays: Welcome to another round of Humans of DataHub. I’m Maggie, the Community Product Manager for DataHub. And today, we are joined by Roko. Roko, thanks for joining us; please tell us a little about yourself, who you are, where you work, and what you’re doing.
Roko Gudic: It’s an absolute pleasure to be here. And I’m glad to even get familiar with DataHub as a tool in the Community. So kudos for that. My name is Roko Gudic, and I work for Infobip. We are the first Croatian unicorn. We are a global communication platform, basically in a B2B market, providing our customers with a way to contact their customers on a number of channels, starting from SMS, Viber, WhatsApp, or whatever they choose. We are currently ranked as a leader in the CCaaS, or contact center as a service, space by Juniper Research.
At Infobip, I am a Data Science Manager leading a team of data scientists. So with all this data we have as a company, our task is to get value out of the data. As you can imagine, not a single company is perfect. We have dealt with the good and the bad side of having huge volumes of data, and that’s actually how I got into data governance as an initiative in my company.
I’m also part of the Data Governance Committee, a multi-functional group of data governance enthusiasts. This leads me to DataHub and how we actually got to DataHub. We were honestly first faced [with the challenge] when we were thinking of proprietary tools, looking at Informatica and solutions like that. At that point in time, we wanted to see what was on the other side, so what was being offered in the open-source community. And, of course, DataHub was one of the first tools I stumbled upon among others, such as Amundsen, and we spent a month validating different approaches. We opted for DataHub because it offered what we needed in the sense of flexibility, the Community around it, the quality of the solution, user experience, and so on. That’s how we got here.
Elizabeth Cohen: What do you enjoy most about the DataHub Community?
Roko Gudic: For starters, I mean, from the first moment I joined Slack, everyone was really, really welcoming. The first message, I felt like, Okay, go there, say hi. I thought, Okay, I’ll just say hi, no one will respond. And Maggie, in just five minutes, responded. Someone is giving this thought and taking care of new arrivals. And from that point onward, I just felt like, I have people around who are willing to help. It’s vibrant and welcoming. And the quality of the product gathered people who either care for high-quality products or are in need of a high-quality product. I believe that [DataHub] is like a perfect storm, so to speak; like-minded people needing the same solutions and helping each other out on this journey. It’s really wonderful.
Maggie Hays: That’s great. I’m glad that you felt welcomed. I also know you go above and beyond to welcome other people and provide support. So we’re super appreciative of that. And I know, you know, we can’t do all of this alone. And especially as the Community grows, it’s just so wonderful to have folks like you willing to jump in, help people feel supported, help give direction, help brainstorm, so we appreciate you just as much.
Roko Gudic: Thank you. That’s precisely why it felt natural to help other people as well. Because I felt the same energy coming towards me. So I just wanted to return it. And that’s the main idea behind the Community and open-source projects.
Maggie Hays: We’re doing something special here. I’m excited about it. So, thinking about DataHub within your organization, with you being in data science and more data governance-focused, what are you looking for DataHub to enable within your organization? Or what problems have you already addressed, or are you looking to address, by incorporating the tool?
Roko Gudic: The most obvious response would be pure data discoverability. In a company like Infobip, we are reaching around 7 billion mobile phones every month, so we are working on a huge scale. And that brings its own set of problems. So, just from a pure data discoverability standpoint, we have already gained much. But the most important thing is having a platform to make business people and IT people work together better. I believe in the business glossary and everything around, you know, standardizing the naming and the standard definitions that need to come from the business side, and mapping the physical representations of those concepts together. For example, when people come to our company, they have one central place where they can go and search for concepts that they are learning about, and see how the data elements connected to those concepts are being used throughout the company, in dashboards and stuff like that. So it eases the onboarding, but it also lets people who have been in the company for a long time speak the same language about what we think means the same thing.
Maggie Hays: Sure. That’s awesome.
Elizabeth Cohen: With DataHub, what has been your favorite feature or use case?
Roko Gudic: Okay, so for me, it would probably be column-level lineage. That was something that the rest of the Community and I anticipated. That’s something that had the biggest, I’d say, jump in value, so to speak. Also, as I mentioned, the business glossary and just, you know, the quality of the search functionality as well. We have, like, hundreds of thousands of entities in DataHub, so a quality search mechanism is necessary. And we have that in DataHub. So yeah, but I would say column-level lineage is number one.
Maggie Hays: Yeah, column-level lineage is one where we’ve had so much excitement. And I think one of the things I’m really excited to figure out as a Community is how, you know, when you’re in the scale of hundreds and thousands of datasets, and then you amplify that with column-level lineage, it can become exceptionally complex very quickly, right? And so, I think we have a really great Community around people wanting to solve this problem. How do we solve column-level lineage in an impactful way, and not just have like a pretty visualization to go along with it, but actually empower people to act on it? So super excited about it.
Roko Gudic: That’s what brings me to another great, great aspect of DataHub that we just started to explore: DataHub Actions, and writing out workflows either for creating new glossary terms and having like a lifecycle for a term; so someone proposes it, then it’s assigned to someone who would need to approve it, and then have it approved and published for the rest of the company. Also, we are looking into data collection in the sense of impact analysis, of course, at the column level. And that’s something that we are kind of really, really enthusiastic about. That’s where the biggest value will happen, actually.
Maggie Hays: Keep us posted on that, because that’ll be a fun story to tell the Community, because I think, you know, there’s a lot that we can learn from how you approach it. And that’s really exciting.
Maggie Hays: With your current implementation of DataHub, and it sounds like you’re going to be working on actions, sounds like you’re trying to automate some of those workflows… What are you excited to see from the DataHub Community beyond those features, additional features or ways to collaborate within the DataHub Community?
Roko Gudic: For us, it’s the new [ingestion] sources. We are looking to contribute the Qlik data source. We are ironing out some details, and I cannot wait to bring it back to the Community. Also, I want to see more ways to engage business stakeholders. And how I would approach it, or one of the ways I would approach it, is to suggest, and I already have plans for this to communicate to the Community, to include our concept of a business process. Because a lot of, you know, data entities and data sources are being consumed in some higher-level business processes; those business processes have their owners, right? And it’s a good way to connect, or to bring this whole data perspective to the non-technical business user. The way I see it, the biggest value for the company actually happens when business and IT or data people start working together to improve the quality, to improve the whole governance. I believe business processes as a concept is something that could potentially tie things up a little bit more.
Maggie Hays: That’s really cool. So kind of like attaching the underlying data to the day-to-day, kind of human workflows or decisions that are being made, that maybe aren’t tied directly to a dashboard or to a model. But tied to more of a concept. That’s really awesome. We’re hearing, or I’m hearing, more and more rumblings that are interested in that. So maybe that’s something that we can put out a Community RFC, right, like, how do we model a business process effectively within DataHub? That’s really cool.
Roko Gudic: Yep, it could be similar to how data flows and jobs are modeled. Some dissimilarities exist [that would require] additional development, but we could take a similar approach with modeling business standards and business processes.
Elizabeth Cohen: Roko, we have one last question for you. What advice would you give someone curious about the DataHub Community or just joining the DataHub Community? What advice would you give them as they’re just starting?
Roko Gudic: I would say, you know, jump in 100%. Immerse yourself in the Community, try the tool, talk to people, ask questions, and hopefully also help other people. Really, just immerse yourself in the Community and experience.
Maggie Hays: We are a great bunch, if I dare say so myself.
Roko Gudic: I would agree.
Maggie Hays: Well, Roko, thank you so much for taking the time with us. We are very appreciative of the contributions you’re giving back, and you’re helping build this Community with us.
If you are new to DataHub, just beginning to understand what “metadata” and “modern data stack” mean, or you’ve just read these words for the first time (howdy, friends! 🤠), let us take a moment to introduce ourselves and share a little history:
DataHub is an extensible metadata platform, enabling data discovery, data observability, and federated governance to tame the complexity of increasingly diverse data ecosystems. Originally built at LinkedIn, DataHub was open-sourced under the Apache 2.0 License in 2020. It now has a thriving community with over 5.5k members (and growing!) and 315+ code contributors, and many companies are actively using DataHub in production.
We believe that data-driven organizations need a reimagined developer-friendly data catalog to tackle the diversity and scale of the modern data stack. Our goal is to provide the most reliable and trusted enterprise data graph to empower data teams with best-in-class search and discovery and enable continuous data quality based on DataOps practices. This allows central data teams to scale their effectiveness and companies to maximize the value they derive from data.
Want to learn more about DataHub and how to join our community? Join us on Slack.