Data Cleaning

Explore data cleaning methods crucial for preparing datasets before visualization in Python Altair. Understand strategies for managing missing values, removing duplicates, and manipulating data to create reliable and accurate visual stories.

Data cleaning is the process of identifying and correcting inaccuracies and inconsistencies in data, which makes the data more reliable and easier to work with.

Data cleaning involves the following main aspects:

  • Handling missing values

  • Managing duplicates

  • Manipulating data (formatting, normalization, and standardization)

Altair provides some functions to perform data cleaning. However, in most cases, it is better to clean the data before passing it to Altair, and to use Altair only to render the visualization.
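
As a rough illustration of cleaning data before it reaches Altair, the sketch below applies the three aspects listed above with pandas. The sample data, the column names (`city`, `temperature`), and the `clean` helper are all hypothetical and invented for this example. Dropping rows with missing values and using min-max normalization are just one possible choice for each step, not the only way to do it.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
raw = pd.DataFrame({
    "city": ["Rome", "Rome", "Paris", "Berlin"],
    "temperature": [21.0, 21.0, None, 18.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps in order (hypothetical helper)."""
    # Managing duplicates: keep the first of each fully identical row.
    out = df.drop_duplicates()
    # Handling missing values: here we simply drop rows without a temperature.
    out = out.dropna(subset=["temperature"])
    # Manipulating data: min-max normalization of temperature to [0, 1].
    # Assumes the column holds at least two distinct values.
    t_min = out["temperature"].min()
    t_max = out["temperature"].max()
    out = out.assign(temperature_norm=(out["temperature"] - t_min) / (t_max - t_min))
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

The resulting `cleaned` DataFrame could then be passed to Altair, for example with `alt.Chart(cleaned)`, so that Altair is responsible only for rendering.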

Handling missing values

A missing value is simply a value that is not present in the data. There are many reasons why values might be missing from the data, such as ...