ASG Perspectives


The Who, What, When, Where and Why of Data Lineage

Since the pandemic and challenges of 2020 began, clients have been putting data and insights at the top of their data management priority list as they seek to uncover and act on the truth in their data. We at ASG Technologies have focused on the benefits of data lineage for some time now, and often demonstrate how data lineage enables faster and more reliable data. This is the ideal time to drill into the 5 W’s of all things data lineage in 2021.

Data lineage is instrumental in quickly and effectively validating that the insights drawn from data are accurate. Without lineage, clients are essentially relying on subject matter experts and laborious manual processes or, worse yet, acting on unverified data.

In this four-part blog series, as well as several ASG lineage webinar panels, I plan to break down some common myths and uncover how clients are achieving new levels of data monetization through data lineage, by asking:

  • Who?

    Who uses and delivers data lineage – stakeholders & data lineage vendors

  • What?

    The truth about what lineage is and how it is used

  • When/Where?

    When and where lineage is best used

  • Why?

    Why bother? Is the effort worth the value?

 

THE “WHO”: Who Uses Lineage?

Valerie Logan, CEO and founder of the Data Lodge, defines Data Literacy as “providing confidence in data at work and in your personal life.” Valerie emphasizes “in your personal life” so that data literacy doesn’t become a boring “work” thing. It’s certainly true that we are surrounded by data, whether we are watching the news, ordering our food or driving our car. What if we had full transparency at our fingertips, down to the source where fake news originates? Or the reason you are getting dog food spam emails and ads when you have a cat? Maybe someday we will have lineage in our daily lives, but for now, let’s talk about who benefits from data lineage.

Valerie Logan created a framework called ISL, Information as a Second Language. Within this model, she breaks ISL down into three types of data users: business users, analytical users and IT. This in turn maps to the VIA model (value, information and analysis), which I will use in describing the “who” of Data Lineage.

Business Users

Three years ago, presenting lineage to a business audience was considered off-limits. This is no longer the case; in fact, savvy business users are extremely data literate. When lineage is presented in a business context, it is a powerful Data Literacy enabler. Today, data is the secret sauce of any corporate initiative. The role of the business is to suggest and validate the pivotal steps and outcomes needed for the initiative to be successful. Without timely, relevant access to these data insights, the business user is at a disadvantage, and business decisions are inferred from experience or inherent tendencies. Using Data Lineage, business users can find data more quickly in the context in which they are asking the question, and can validate more reliably that the data is correct. This helps them move from business context to data with confidence when deciding which fork in the road to choose.

The Data Analyst

The data analyst and data engineer are responsible for trusted insights, datasets and reports. It is their job not only to drive toward a deep understanding of the data but also to align the data to the business outcome. If you truly want to gain 80% of the value from 20% of the data, the Analyst is your best bet for getting to a reliable data-centric culture. Instead of manually searching, validating and fixing (wrangling) corporate data, analysts can use Data Lineage to automate and streamline their analysis. With the ability to search for a critical data element, validate privacy rules, validate business context and know exactly where the dataset was pulled from in less than five minutes, users can eliminate hours of research, phone calls and project meetings.

IT

IT projects rely on change control and impact analysis to de-risk any production move. A large data project typically includes IT, Designers, Engineers and Analysts, all working to ensure that the move will be successful, that the data will be protected, and that meaningful insights can be derived without breaking anything when deployed. Data Lineage pinpoints the scope of the change from several angles: If I change this process, what breaks down the line? How big is the scope of this change? How many systems and lines of business (LOBs) is this data element propagated to? And what business rules, processes and policies are associated with this change?
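The impact questions above can be pictured as a walk over a lineage graph. The following is only a minimal sketch, not how any particular lineage product works: the asset names and the LINEAGE dictionary are hypothetical, and a real lineage tool would supply the edges by scanning code and metadata.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its value.
LINEAGE = {
    "crm.customers": ["etl.clean_customers"],
    "etl.clean_customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.churn", "report.revenue_by_region"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["report.revenue_by_region"],
}


def downstream_impact(graph, changed):
    """Return {asset: hops} for every asset downstream of `changed`."""
    hops = {}
    queue = deque([(changed, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in graph.get(node, []):
            # Skip assets already reached, and the starting asset itself.
            if child != changed and child not in hops:
                hops[child] = depth + 1
                queue.append((child, depth + 1))
    return hops


impact = downstream_impact(LINEAGE, "crm.customers")
for asset, distance in sorted(impact.items(), key=lambda item: item[1]):
    print(f"{asset} is affected ({distance} hop(s) downstream)")
```

The size of the returned dictionary is one rough measure of the scope of a change, and the hop count shows how far down the line the change propagates.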

Transparency across technologies and business lines not only cuts root cause analysis down from weeks to minutes, but also traces through code that maybe hasn’t been reviewed for a very long time. In some cases, lineage finds code and callouts to third-party data that were completely unknown.

Typically, when a client is operational on lineage with five or six hops, they begin to see deep-rooted issues that have never been exposed before. Data fixes were applied on top of fixes, without ever revisiting past decisions about the data. With transparency across technologies and across business borders, clients begin to perform a “surgical analysis” on the data to fix long-standing issues, and they begin to gain value from their technology investments once again. They find it wasn’t a technology issue so much as a data transparency issue.
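Root cause analysis runs the same kind of trace in the opposite direction, from a report back to its original sources. Here is a small sketch under the same caveat: the EDGES list and every asset name in it are made up for illustration, including the third-party feed that stands in for a previously unknown callout.

```python
# Hypothetical lineage edges, written as (source, target) pairs.
EDGES = [
    ("crm.customers", "etl.clean_customers"),
    ("vendor_feed.credit_scores", "etl.clean_customers"),
    ("etl.clean_customers", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "report.churn"),
]


def upstream_sources(edges, asset):
    """Return the sorted original sources (assets with no parents) of `asset`."""
    parents = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)

    seen = set()
    roots = set()
    stack = [asset]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        node_parents = parents.get(node, [])
        # An upstream asset with nothing feeding it is an original source.
        if not node_parents and node != asset:
            roots.add(node)
        stack.extend(node_parents)
    return sorted(roots)


sources = upstream_sources(EDGES, "report.churn")
print(sources)
```

In this toy graph the trace surfaces the vendor feed alongside the CRM table, which is the kind of hidden dependency the paragraph above describes.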

THE “WHO”: Who Delivers Lineage?

There are more Data Lineage vendors than ever before. In fact, lineage is now more of a commodity among Data Management tools. Since I work for a vendor, I will keep my own bias out of it; instead, I will offer a list of considerations when choosing a vendor solution:

  • Understand if the lineage coverage applies to 70-100% of your technologies. If not, how will the lineage gaps be filled?
  • Beware of wild promises, such as “we can create lineage in three seconds,” or claims that lineage inferred with AI can pinpoint the precise path of how and where the data is flowing.
  • Is Data Lineage something that the vendor leads with up front or are they in the process of rolling it out?
  • Do they require another vendor to do the lineage for them? What is the cost and efficiency of this?
  • Does the lineage cross business borders and cover a diverse range of technologies? Does it offer a unified approach across the business to capture the lineage? If not, you will have pockets of lineage rather than end-to-end transparency.
  • If it seems too good to be true, have them prove it with your data.
  • Finally, the biggest question of all: Does the lineage itself solve your business use case?

Good luck and stay tuned for the next W, “What,” where I will reveal some truths about lineage and how it is used.

Also join our upcoming lineage panel, “Inside Data Lineage: Data Lineage Expert Panel,” on April 27th at 11am ET, where ASG Technologies leaders will introduce you to the advancements of data lineage and the surprising array of business use cases for intelligent data lineage.

Posted: 3/24/2021 8:00:00 AM by Susan Laine - Global DI Evangelist