The Dataset and Exploratory Data Analysis

Get started with the Breast Cancer Wisconsin dataset and perform exploratory data analysis.

Welcome to the hands-on section. In this lesson, we’ll work with a real dataset: Breast Cancer Wisconsin (Diagnostic), which is available from the UCI Machine Learning Repository.

The data

At this stage, we should feel comfortable writing functions, analyzing data, and training machine learning models. For this project, we are given two files: a data file (breast_cancer_data_no_feature_names.csv) and a text file (features_names_breast_cancer.txt) that contains the feature names. Storing feature names separately is common practice when the data has many features. We can write a custom function that reads the feature names from the text file, reads the data from the CSV file, and names the data columns accordingly. First things first: let's import the required libraries.
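
As a rough illustration, the custom function described above might look like the sketch below, starting with the imports. It rests on assumptions the lesson does not confirm: that the text file holds one feature name per line, that the CSV file has no header row, and that pandas is the library in use. The function names read_feature_names and load_named_data are hypothetical and chosen only for this example.

```python
from pathlib import Path

import pandas as pd


def read_feature_names(names_path):
    """Return the non-empty, stripped lines of a text file as a list of names.

    Assumption: the file stores one feature name per line.
    """
    text = Path(names_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_named_data(csv_path, names_path):
    """Read a header-less CSV file and label its columns with the given names.

    Assumption: the CSV has no header row and its column order matches
    the order of the names in the text file.
    """
    names = read_feature_names(names_path)
    df = pd.read_csv(csv_path, header=None)

    # Fail early if the two files disagree, rather than mislabeling columns.
    if len(names) != df.shape[1]:
        raise ValueError(
            f"Found {len(names)} feature names but {df.shape[1]} data columns."
        )

    df.columns = names
    return df
```

Under these assumptions, calling load_named_data("breast_cancer_data_no_feature_names.csv", "features_names_breast_cancer.txt") would return a DataFrame with named columns, which we could then inspect with df.head() and df.info(). If the real files are laid out differently (for example, comma-separated names on a single line), the reading step would need to be adjusted.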