Data Preprocessing: Missing Values

Learn about data preprocessing and how to handle missing values.

We'll cover the following

Data preparation and cleaning
Missing values

Cope with missing values

Data preparation and cleaning

Our data contains several types of values. There are numerical data, such as “Age,” “SibSp,” “Parch,” and “Fare.” Then there are categorical data. Some of the categories are represented by numbers (“Survived” and “Pclass”). Others are represented by text (“Sex” and “Embarked”). Finally, there is textual data (“Name,” “Ticket,” and “Cabin”).
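To see how these groups show up in practice, here is a minimal sketch that splits columns by their pandas dtype. It assumes the data lives in a pandas DataFrame called train, as the later call to train.info() suggests. The three rows are made-up stand-in values, not the real dataset, and the column names follow the Kaggle Titanic file (for example, Pclass).

```python
import pandas as pd

# Hypothetical stand-in rows (made-up values, not the real Titanic data).
train = pd.DataFrame({
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 3],
    "Name": ["Passenger A", "Passenger B", "Passenger C"],
    "Sex": ["male", "female", "female"],
    "Age": [22.0, 38.0, 26.0],
    "SibSp": [1, 1, 0],
    "Parch": [0, 0, 0],
    "Ticket": ["T1", "T2", "T3"],
    "Fare": [7.25, 71.28, 7.92],
    "Cabin": [None, "C85", None],
    "Embarked": ["S", "C", "S"],
})

# Columns stored as integers or floats.
numeric_columns = train.select_dtypes(include="number").columns.tolist()

# Everything else: text-based categories and free text.
text_columns = train.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_columns)
print("text:   ", text_columns)
```

Note that the dtype alone cannot tell a true numerical column from a category encoded as a number: “Survived” and “Pclass” land in the numeric group even though they are categorical.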

This is quite a mess for data that we want to feed into a computer. Furthermore, the output of train.info() shows that the number of non-null values differs between columns. While most columns have 891 values, “Age” has only 714, “Cabin” has 204, and “Embarked” has 889. The remaining entries in these columns are missing.
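The gaps can also be counted directly instead of being read off the summary. The following is a minimal sketch, assuming train is a pandas DataFrame. The small frame below is made-up stand-in data used only to keep the example self-contained.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the training set (made-up rows, not real data).
train = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],
    "Cabin": [None, "C85", None, "C123", None],
    "Embarked": ["S", "C", "S", None, "Q"],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
})

# Number of missing entries per column.
missing_counts = train.isnull().sum()

# Number of non-null entries per column, the figure train.info() reports.
non_null_counts = train.notnull().sum()

print(missing_counts)
```

Applied to the real training set, the counts given above imply 177 missing values for “Age” (891 minus 714), 687 for “Cabin,” and 2 for “Embarked.”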

Before we can feed our data into any machine learning algorithm, we need to clean it up. Preprocessing involves the following steps:

Missing values
Identifiers
Handling text and categorical attributes
Feature scaling
Training and testing
