How to Get Started on Data Governance
In this guest post, Syed Atif Akhtar provides insights on how an organization can get started on Data Governance. These insights are based on his experience helping organizations big & small to put…
The open source Python project data-lineage now supports column-level data lineage. Column lineage enables fine-grained data governance projects for all stakeholders. Data Stewards can verify…
Metadata in a data lake is important for the productivity of everyone in the data ecosystem. The different types of metadata, systems to store them, and their consumers can be very confusing. How is a…
What is Data Governance? The first step is to understand what data governance is. Data Governance is an overloaded term and means different things to different people. It has been helpful to define…
This blog post describes how to generate data lineage from query history in Snowflake using the data-lineage Python package. data-lineage generates a DAG by parsing SQL statements in query history…
Today we released an open source Python project data-lineage to visualize and analyze data lineage. The project was developed in collaboration with data teams on data governance initiatives over the…
What is meant by data lineage? In biology, lineage is a sequence of species, each of which is considered to have evolved from its predecessor. Similarly, Data Lineage is a sequence of transformations…
What is database auditing? Database audits track two important activities in databases: user login and database object access. User Login: In postmortems, it is important to get a list of humans (vs micro…
Why do you need a data catalog? A data catalog is important to solve two problems in modern data teams: (1) avoid poor productivity of people and the ROI of data; (2) governance risk. Productivity: Analysts…
GDPR, or the General Data Protection Regulation, is an EU law on data protection and privacy. More recently, California passed its own privacy law, CCPA, or the California Consumer Privacy Act. The CCPA…
AWS and GCP best practices suggest running databases in a VPC and subnet. Applications run in a separate subnet and are exposed to the outside network through load balancers. There are many options…
Veris Community compiles information security breaches and incidents based on a standard. They also maintain a database of incidents, VCDB, which is regularly updated by the community. Caveat: The community…
AWS Lake Formation permissions control access to data sets in your AWS data lake at table- and column-level granularity. For a quick primer, read the Lake Permissions by Example blog post. Once…
AWS Lake Formation helps to set up a secure data lake on AWS S3. One of the main goals of the product is Simplified Security Management. The central tenet of this goal is to define security…
AWS Lake Formation helps to build a secure data lake on data in AWS S3. This blog will help you get started by describing the steps to set up a basic data lake with S3, Glue, Lake Formation and Athena…
Data Lineage tracks data transformation through all systems. It is important for data governance and security. In data warehouses and data lakes, a team of data engineers maintains a canonical set of…