CDO Series: Intelligent Data Management - Realizing the Value from Lineage Advancements

In my previous blog post, I showcased clients who are turning data into knowledge and knowledge into information. Their immediate data strategy focused on quick, powerful use cases that enable data attestation for privacy concerns.

In this blog post, I have asked my colleague, Yetkin Ozkucur, to write about what he’s currently finding in the technical realm of data management and lineage automation. Yetkin has been leading and creating best practices in this field for 20 years.

Five to six years ago, we were in education mode with customers regarding data lineage. Nowadays, everyone understands the value of data lineage, but they are worried about the level of effort and complexity it will take to get there. I understand their concerns. There are a lot of vendors claiming they can do automated lineage, and each vendor has a different approach, as well as a variation on the very definition of data lineage. There is a big difference between automated lineage and “somewhat” automated lineage. The degree of automation determines whether the lineage effort as a whole succeeds.

Over the years (decades, in fact), we have tuned and perfected our lineage process, technology and approach. We are achieving highly impactful results, as evidenced by the Gartner peer reviews, specifically the metadata execution capability scores. I would like to highlight a few key points about the importance of people, technology and process for a successful automated lineage project.

  1. Always tie lineage back to a business case. Companies do not do lineage for lineage's sake. Whether it's analytics, regulatory compliance, modernization or data governance, there is always a business driver behind it. Tying lineage back to that driver helps focus on what is needed. For example, is lineage required only on regulatory reports? Are you trying to replace LIBOR? Are you assessing data impact while moving to the cloud? Depending on the answer, features like lineage snapshots or vertical (business-to-technical) lineage may need to be in scope.
  1. Setting the right expectations and delivering on them is key. There is no tool in the world that can deliver 100% automated lineage out of the gate. You must do the due diligence to set the right level of expectation for automation and provide an easy-to-follow, governed process to fill the gaps. The key is to achieve a successful outcome for the use case at hand, which will typically require more than just lineage features.
  1. Automation is the biggest cost factor for lineage projects. Here at ASG, we have been building scanners for more than two decades. These scanners have evolved beyond a simple metadata schema pull. With the goal of less effort for more value, we have expanded what is pulled and how it is ingested, tagged and linked. We have improved our scanner technology with every implementation, adapting to new marketplace tools and code, which has made it easier to troubleshoot gaps and issues.
  1. AI + LINEAGE = DATA INTELLIGENCE. Our biggest recent leap forward comes through our partnership with Fourth-IR, world-renowned for their AI, who saw the value of combining AI with lineage. We have just released version one of our Virtual Data Steward solution, which we estimate will shorten the repository build process by at least 50% by helping align business assets to lineage and tying lineage back to the business context. AI will also further automate the lineage troubleshooting process by finding gaps and automatically stitching them.

If you understand the value of lineage but have been scared off by the time, effort and cost of implementation, I encourage you to take one more look at the lineage technology available today. We have many clients coming to us because their data governance efforts have been falling short of delivering real value when it comes to finding and fixing long-standing data issues. Full transparency into what's happening to the data across technologies and business silos is changing the game for data-centric organizations.

Posted: 4/23/2020 8:30:00 AM by Sue Laine