The data and exploratory data analysis

The Titanic dataset is a common first step into classification in machine learning. The goal is to predict whether a passenger survived the sinking of the Titanic.
Exploratory data analysis (EDA) reveals that:
- The Cabin column is missing 77.1%, the Age column is missing 19.9%, and the Embarked column is missing 0.2% of its data.
- Among the deceased, most were male.
- The rate of survival was higher for the class-1 passengers.
- Port S was the busiest port of embarkation for every class, so we might expect its passengers to account for more of the survivors. However, the rate of survival was higher for port C.
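The checks behind these observations can be sketched with pandas. This is a minimal illustration, assuming the usual Titanic column names (`Survived`, `Pclass`, `Sex`, `Age`, `Cabin`, `Embarked`); the eight rows below are made-up toy data, so the numbers they produce are not the percentages quoted above.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic training data (hypothetical rows, not the real dataset).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1],
    "Pclass":   [3, 1, 3, 1, 3, 2, 3, 1],
    "Sex":      ["male", "female", "female", "male", "male", "female", "male", "female"],
    "Age":      [22.0, 38.0, np.nan, 54.0, np.nan, 14.0, 2.0, 58.0],
    "Cabin":    [None, "C85", None, "E46", None, None, None, "C103"],
    "Embarked": ["S", "C", "S", "S", "Q", "C", "S", None],
})

# Percentage of missing values in each column.
missing_pct = df.isna().mean().mul(100).round(1)

# Sex breakdown among the passengers who did not survive.
deceased_by_sex = df.loc[df["Survived"] == 0, "Sex"].value_counts()

# Survival rate per passenger class and per port of embarkation.
# Rows with a missing group key are excluded by groupby by default.
survival_by_class = df.groupby("Pclass")["Survived"].mean()
survival_by_port = df.groupby("Embarked")["Survived"].mean()

# Passenger counts per port within each class, to see which port was busiest.
port_counts = pd.crosstab(df["Pclass"], df["Embarked"])

print(missing_pct)
print(deceased_by_sex)
print(survival_by_class)
print(survival_by_port)
print(port_counts)
```

Comparing `port_counts` with `survival_by_port` shows why a busy port does not imply a high survival rate: the first is a raw count, while the second is a proportion within each port.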

Data preprocessing and preparation

Before moving on to model training and evaluation, the data must be preprocessed, for example by removing missing values, handling categorical features by creating dummies, ...
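The two steps named above, removing missing values and creating dummies, might look like the following sketch. The toy rows are hypothetical, and the specific choices (dropping the mostly empty `Cabin` column, dropping rows with a missing `Age` or `Embarked`, and using `drop_first=True`) are assumptions for illustration rather than the lesson's exact pipeline.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic training data (hypothetical rows, not the real dataset).
raw = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1],
    "Pclass":   [3, 1, 3, 1, 3, 2, 3, 1],
    "Sex":      ["male", "female", "female", "male", "male", "female", "male", "female"],
    "Age":      [22.0, 38.0, np.nan, 54.0, np.nan, 14.0, 2.0, 58.0],
    "Cabin":    [None, "C85", None, "E46", None, None, None, "C103"],
    "Embarked": ["S", "C", "S", "S", "Q", "C", "S", None],
})

# Assumption: Cabin is mostly missing, so drop the column rather than its rows.
prepared = raw.drop(columns=["Cabin"])

# Remove the remaining rows that still have a missing Age or Embarked value.
prepared = prepared.dropna(subset=["Age", "Embarked"])

# Turn the categorical features into dummy (indicator) columns.
# drop_first=True omits one level per feature to avoid redundant columns.
prepared = pd.get_dummies(prepared, columns=["Sex", "Embarked"], drop_first=True)

# Split into the feature matrix and the target for the training step.
X = prepared.drop(columns=["Survived"])
y = prepared["Survived"]

print(X.head())
print(y.value_counts())
```

Dropping rows is the simplest way to handle missing values, but it shrinks the dataset; imputing `Age` (for example with the median) is a common alternative when too many rows would be lost.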
