Data Lineage Demystified

January 5, 2023

What Is Data Lineage?

Data lineage describes data's origins, movements, characteristics, and quality. According to Stewart Bond, lineage typically describes where the data begins and how it is changed on the way to the final outcome. Technology projects have long used this traditional approach to data lineage. For example, during the creation of a new clinician/patient system at a large technology company, project members would refer to a map of tables and joins to decide what SQL to use for selecting, summarizing, or grouping the data. Programmers would update the code to generate the needed values, and QA would read the same map to anticipate ways to break the software. While this method was a start, data lineage needs an expanded definition.
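To make the traditional "map of tables and joins" concrete, here is a minimal sketch in Python. It is an illustration only, not the system described above: the table names, the LINEAGE dictionary, and the trace_origins function are all hypothetical. Each table is mapped to the upstream tables it is built from, and the function walks that map backward to find where the data begins.

```python
from collections import deque

# Hypothetical lineage map (assumption for illustration): each table lists
# the upstream tables it is derived from. An empty list marks a source table.
LINEAGE = {
    "patient_visits_summary": ["visits", "patients"],
    "visits": ["raw_visit_events"],
    "patients": ["raw_patient_records"],
    "raw_visit_events": [],
    "raw_patient_records": [],
}


def trace_origins(table, lineage):
    """Return the set of source tables (no upstream) that feed `table`.

    A table missing from the map is treated as a source, since nothing
    is known about where it comes from.
    """
    origins = set()
    seen = {table}
    queue = deque([table])
    while queue:
        current = queue.popleft()
        upstream = lineage.get(current, [])
        if not upstream:
            origins.add(current)
        for parent in upstream:
            if parent not in seen:  # guard against revisiting shared parents
                seen.add(parent)
                queue.append(parent)
    return origins


print(sorted(trace_origins("patient_visits_summary", LINEAGE)))
# prints ['raw_patient_records', 'raw_visit_events']
```

A map like this answers only the "where did it come from" question. It says nothing about data quality or characteristics, which is part of why the traditional view falls short.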