Data Lineage is an open-source application to query and visualize data lineage in databases, data warehouses, and data lakes in AWS and GCP.
- Generate lineage from SQL query history.
- Support ANSI SQL queries.
- Integrate with Jupyter Notebook.
- Visualize data lineage using Plotly.
- Select a source or target table.
- Pan, zoom, and select within the graph.
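
To make the idea of generating lineage from SQL query history concrete, here is a minimal sketch using only the Python standard library. It is not this project's parser or API: the regular expressions, the function names `extract_edges` and `build_lineage`, and the sample table names are all assumptions made for illustration, and the patterns only handle simple `INSERT INTO ... SELECT` statements.

```python
import re
from collections import defaultdict

# Hypothetical, simplified patterns (assumption, not the project's parser).
# They only cover "INSERT INTO <target> SELECT ... FROM <source> [JOIN <source>]".
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_edges(query):
    """Return (source, target) pairs for one INSERT ... SELECT statement."""
    target = TARGET_RE.search(query)
    if target is None:
        return []  # the statement does not write to a table
    sources = SOURCE_RE.findall(query)
    return [(source, target.group(1)) for source in sources]


def build_lineage(query_history):
    """Map each target table to the set of tables it reads from."""
    lineage = defaultdict(set)
    for query in query_history:
        for source, target in extract_edges(query):
            lineage[target].add(source)
    return dict(lineage)


# Illustrative query history; the table names are made up.
history = [
    "INSERT INTO staging.orders SELECT * FROM raw.orders",
    "INSERT INTO mart.revenue SELECT o.id, c.region "
    "FROM staging.orders o JOIN staging.customers c ON o.cust_id = c.id",
]
lineage = build_lineage(history)
```

A real implementation would need a proper SQL parser to cope with subqueries, CTEs, and quoted identifiers; the regular expressions here are only enough to show how query history turns into source-to-target edges.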
Check out an example data lineage notebook.
Data Lineage enables the following use cases:
- Business Rules Verification
- Change Impact Analysis
- Data Quality Verification
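
As an illustration of change impact analysis, the sketch below walks a lineage graph breadth-first to list every table downstream of a changed table. It does not use this project's API: the `DOWNSTREAM` mapping, the `impacted_tables` function, and the table names are hypothetical and exist only for this example.

```python
from collections import deque

# Hypothetical lineage edges: source table -> tables built from it.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["mart.revenue", "mart.churn"],
    "staging.customers": ["mart.revenue"],
}


def impacted_tables(changed_table, downstream):
    """Return every table reachable from changed_table, breadth-first."""
    seen = {changed_table}  # guards against cycles and duplicates
    order = []
    queue = deque([changed_table])
    while queue:
        table = queue.popleft()
        for child in downstream.get(table, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = impacted_tables("raw.orders", DOWNSTREAM)
```

With these made-up edges, a change to `raw.orders` reaches `staging.orders` first and then both `mart` tables, while a change to a leaf table such as `mart.revenue` affects nothing further.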
Check out the post on using data lineage for cost control for an example of how data lineage can be used in production.
- AWS Athena
- AWS Redshift