Parsing SQL queries provides superpowers for monitoring data health. This post describes how to get started on parsing SQL for data observability. Query history of a data warehouse is a rich source of…
Data Observability Articles for Data Lakes
Data Observability is a set of processes for understanding the health of data in an organization. Some factors that determine the health of the data are:
- Freshness: Tracks whether data pipelines read the latest and most up-to-date version of the data.
- Field Statistics: Statistics such as the ratio of null values and min/max should be in line with historical trends.
- Volume: The number of rows or the data size of a dataset should be in line with historical trends.
- Lineage: Data lineage provides context when triaging data health issues.
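
The freshness, field statistics, and volume factors above can be sketched as simple checks. This is only a minimal illustration using the Python standard library. The function names, the z-score threshold of 3, and the idea of measuring freshness as the age of the last update are all assumptions made for this sketch, not part of any particular tool.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def null_ratio(values):
    """Fraction of entries that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def in_line_with_history(current, history, z_threshold=3.0):
    """Return True if `current` lies within `z_threshold` sample standard
    deviations of the historical mean (hypothetical rule for this sketch)."""
    if len(history) < 2:
        return True  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current == mu
    return abs(current - mu) / sigma <= z_threshold


def is_fresh(last_updated, now, max_age):
    """Return True if the dataset was updated within `max_age` of `now`
    (assumed proxy for freshness)."""
    return now - last_updated <= max_age


# Example: compare today's row count with made-up historical row counts.
row_count_history = [980, 1010, 995, 1005]
volume_ok = in_line_with_history(1000, row_count_history)
```

The same `in_line_with_history` helper can be applied to a field statistic, for example by passing today's `null_ratio` of a column together with its previous daily values.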
Rajat Venkatesh — 10/01/2021 — 3 Min Read — In Data Catalog, Data Lineage, MySql, Snowflake, Data Observability