Introduction to Data Cleaning

Explore the data cleaning stage in the Data Science Lifecycle. Understand why cleaning data is essential to avoid errors from incomplete or inconsistent data. Learn about handling missing values, duplicates, outliers, and converting data formats to prepare for effective analysis.

We'll cover the following...

- Why clean data?
- Cleaning data

Why clean data?

The data we receive and use is rarely perfect. Many factors can lead to incomplete, inconsistent, or incorrect data: collecting data from multiple sources, corruption while storing or retrieving data, human error during data entry, and data loss during transfer over a network. If we use the data as received, our analysis will be flawed, and any conclusions drawn from it may be wrong. Data cleaning is therefore a necessary step before any analysis.
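To make the point above concrete, here is a minimal sketch of how a few bad records can distort even a simple average. The records, the `naive_mean_age` and `clean` helpers, and the 0 to 120 plausibility range are all assumptions invented for illustration; they are not part of the lesson's own material, and a real project would more likely use a tabular data library.

```python
from statistics import mean

# Hypothetical raw records as (id, age) pairs, invented for illustration.
raw_records = [
    (1, 34),
    (2, 29),
    (2, 29),    # duplicate row, e.g. from combining two sources
    (3, None),  # missing value, e.g. lost during transfer
    (4, 310),   # implausible value, e.g. a data entry error
    (5, 41),
]


def naive_mean_age(records):
    """Average the ages as received, silently treating missing values as 0."""
    return mean(age if age is not None else 0 for _, age in records)


def clean(records, min_age=0, max_age=120):
    """Drop duplicate ids, missing ages, and ages outside an assumed plausible range."""
    seen_ids = set()
    cleaned = []
    for rec_id, age in records:
        if rec_id in seen_ids:
            continue  # duplicate of a record we already kept
        if age is None:
            continue  # missing value
        if not (min_age <= age <= max_age):
            continue  # outside the assumed plausible range
        seen_ids.add(rec_id)
        cleaned.append((rec_id, age))
    return cleaned


cleaned_records = clean(raw_records)
naive = naive_mean_age(raw_records)
cleaned_mean = mean(age for _, age in cleaned_records)

print(f"Mean age, data as received: {naive:.2f}")
print(f"Mean age, after cleaning:   {cleaned_mean:.2f}")
```

In this sketch the uncleaned average is more than double the cleaned one, even though only three of the six rows are problematic. Dropping rows is only one possible choice; depending on the analysis, missing or implausible values might instead be corrected or filled in.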

