
Data Quality Measurement

Learn why data quality is important and how to measure it.

We'll cover the following:

- What is bad data?
- Data quality dimensions
  - Business dimensions
    - Descriptive
    - User-driven
  - Technical dimensions

Data engineering is about delivering high-quality data to the right people at the right time. High-quality data is essential for making accurate and reliable decisions. Poor data quality leads to poor business decisions, which in turn can mean lost revenue, decreased customer satisfaction, increased costs, and a damaged reputation. So what does high-quality data mean, why is it important, and how do we evaluate and measure it?

What is bad data?

One effective approach to grasping the concept of quality data is to consider its opposite: what is bad data? Which types of data consistently get complaints from stakeholders? Let's look at a few real-life examples:

- Data accuracy: When computing net revenue, a specific cost type can be overlooked, which produces an inaccurate revenue figure.
- Data freshness: Stakeholders run daily analyses on the previous day's figures. If the numbers haven't been updated, they get frustrated and their decisions are delayed.
- Breaking schema changes: When a column or table is deleted or renamed, the scripts that data users depend on stop working.
- Column description: Column names or descriptions are not descriptive enough for users to understand what the columns mean. ...
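
The four complaints above can each be turned into a small automated check. The following is a minimal sketch using only the Python standard library, not a definitive implementation: the function names, the one-day freshness threshold, the ten-character minimum description length, and the sample column and cost-type names are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_updated, now, max_age=timedelta(days=1)):
    """Freshness: True when the data was last updated more than max_age ago."""
    return now - last_updated > max_age


def breaking_schema_changes(expected_columns, actual_columns):
    """Schema: columns that consumers rely on but that are no longer present."""
    return sorted(set(expected_columns) - set(actual_columns))


def undocumented_columns(descriptions, min_length=10):
    """Descriptions: columns whose description is missing or too short to help."""
    return sorted(
        name
        for name, text in descriptions.items()
        if len((text or "").strip()) < min_length
    )


def net_revenue(gross_revenue, costs, required_cost_types):
    """Accuracy: refuse to compute net revenue when a required cost type is absent."""
    missing = sorted(set(required_cost_types) - set(costs))
    if missing:
        raise ValueError(f"missing cost types: {missing}")
    return gross_revenue - sum(costs.values())


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to exercise each check.
    now = datetime(2024, 1, 3, 8, 0, tzinfo=timezone.utc)
    last_load = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
    print(is_stale(last_load, now))
    print(breaking_schema_changes(["order_id", "amount"], ["order_id", "total_amount"]))
    print(undocumented_columns({"amt": "", "order_id": "Unique identifier of the order"}))
    print(net_revenue(1000.0, {"shipping": 100.0, "refunds": 50.0}, ["shipping", "refunds"]))
```

In practice, checks like these would typically run as a step in the pipeline, so that stale data, a removed column, or a forgotten cost type is caught before stakeholders notice it.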
