Today we released data-lineage, an open-source Python project to visualize and analyze data lineage. Developing this project required collaboration with data teams working on various data governance initiatives over the last couple of years.
Many open-source and commercial tools capture data lineage. However, data engineers run into two main problems with them: the tools take a lot of effort to set up and maintain, and they require constant discipline in capturing and sending all the metadata.
Both of these factors result in incomplete projects and lost opportunities to improve performance, ROI, and data quality. data-lineage addresses these problems by setting the following goals:
The following features help to achieve these goals:
You can get a data lineage graph with fewer than ten lines of Python code in a Jupyter Notebook. Currently, data-lineage supports Postgres, with support for additional databases on the way. Try it out if you need data lineage for your work, and send us your feedback!
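To make the idea of a lineage graph concrete, here is a minimal sketch in plain Python. It does not use the data-lineage API. The table names, the `EDGES` list, and the helpers `build_lineage` and `upstream` are all hypothetical, and the sketch assumes the source-to-target pairs have already been extracted from queries such as `INSERT INTO ... SELECT ... FROM ...`.

```python
from collections import defaultdict, deque

# Hypothetical (source_table, target_table) pairs, assumed to have been
# extracted from statements like: INSERT INTO target SELECT ... FROM source
EDGES = [
    ("raw.orders", "staging.orders"),
    ("raw.customers", "staging.customers"),
    ("staging.orders", "marts.revenue"),
    ("staging.customers", "marts.revenue"),
]


def build_lineage(edges):
    """Map each table to the set of tables it is directly built from."""
    parents = defaultdict(set)
    for source, target in edges:
        parents[target].add(source)
    return parents


def upstream(parents, table):
    """Return every table that feeds into `table`, directly or indirectly."""
    seen = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        # .get() avoids creating empty entries for tables with no parents
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


lineage = build_lineage(EDGES)
print(sorted(upstream(lineage, "marts.revenue")))
# ['raw.customers', 'raw.orders', 'staging.customers', 'staging.orders']
```

Walking the graph upstream like this answers the typical lineage question of which source tables a report ultimately depends on. A real tool also has to parse the queries and store the metadata, which this sketch leaves out.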
Links: