Outlier Detection and Removal

This lesson will focus on how to detect outliers in the data and what to do with them.

Outlier detection

Detecting outliers is an important step in data cleaning and exploration. It highlights the anomalies in the data, which can give us valuable insights into it. So, how can we detect outliers?

Outliers can be detected both visually and mathematically. Plots such as box plots and scatter plots are very helpful for visualizing them. However, it is sometimes tricky to decide whether or not to remove the outliers. We should remove outliers when we are certain that they are the result of errors.
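To make the mathematical approach concrete, here is a minimal sketch of one common rule of thumb, the interquartile range (IQR) rule, which is also the rule box plots typically use to draw their whiskers. This is an illustration rather than the method this lesson necessarily uses later: the function name iqr_outliers, the multiplier k=1.5, and the sample numbers are all assumptions made for the example.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # n=4 yields the three quartile cut points; method="inclusive"
    # treats the given values as the whole population.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep only the points that fall outside the fences.
    return [v for v in values if v < lower or v > upper]

# Made-up numbers for illustration only, not taken from the lesson's dataset.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 95]
outliers = iqr_outliers(sample)
print(outliers)
```

The function only flags candidate outliers. Whether to remove them is still a judgment call that depends on whether they come from errors.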

We will discuss some methods for detecting and removing outliers, using the Sample Sales Data. The data is in the file sales_data.csv.
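As a starting point, the sketch below shows one way to pull a single numeric column out of a CSV file using only the Python standard library. The helper read_numeric_column is hypothetical, and the file's column names are not given here, so the column you pass in must be one that actually exists in sales_data.csv.

```python
import csv

def read_numeric_column(path, column, encoding="utf-8"):
    """Read one column of a CSV file as floats, skipping blank cells."""
    values = []
    with open(path, newline="", encoding=encoding) as f:
        # DictReader uses the first row of the file as the column names.
        for row in csv.DictReader(f):
            cell = (row.get(column) or "").strip()
            if cell:
                values.append(float(cell))
    return values
```

For example, read_numeric_column("sales_data.csv", "SALES") would return that column as a list of floats, assuming the file has a column named SALES (an assumed name used only for illustration). If the file is not UTF-8 encoded, pass the appropriate encoding argument.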
