The Dataset and Exploratory Data Analysis

Explore the process of loading the Breast Cancer Wisconsin dataset and conducting exploratory data analysis. Learn to read feature names, calculate summary statistics, visualize distributions, and distinguish between benign and malignant tumor classes to prepare data for Support Vector Machine modeling.

Welcome to the hands-on section. In this lesson, we’ll work with a real dataset: Breast Cancer Wisconsin (Diagnostic), which is available from the UCI Machine Learning Repository.

The data

At this stage, we should feel very comfortable writing functions, analyzing data, and training machine learning algorithms. We are given a data file (breast_cancer_data_no_feature_names.csv) and another file (features_names_breast_cancer.txt) that contains feature names for this project. This is a common practice when we ...
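The two files described above can be combined into a single labeled table. The following is a minimal sketch using pandas, not the lesson's own code. It assumes the names file lists one feature name per line, which is not confirmed by the text, and the helper name load_with_feature_names is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_with_feature_names(data_path, names_path):
    """Read a header-less CSV and label its columns from a separate names file."""
    # Assumption: the names file holds one feature name per line.
    lines = Path(names_path).read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]

    # header=None tells pandas that the first row is data, not column labels.
    df = pd.read_csv(data_path, header=None)

    # Fail early if the names do not line up with the columns.
    if len(names) != df.shape[1]:
        raise ValueError(
            f"{len(names)} names for {df.shape[1]} columns; "
            "check the format of the names file"
        )
    df.columns = names
    return df


if __name__ == "__main__":
    data_file = Path("breast_cancer_data_no_feature_names.csv")
    names_file = Path("features_names_breast_cancer.txt")
    # Only run the demo when both files are present in the working directory.
    if data_file.exists() and names_file.exists():
        df = load_with_feature_names(data_file, names_file)
        print(df.shape)   # number of rows and columns
        print(df.head())  # first five rows with the attached feature names
```

The length check is a deliberate design choice: if the names file uses a different layout (for example, comma-separated names on one line), the function raises an error instead of silently mislabeling columns.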