Introduction to Jupyter and pandas

Learn about Jupyter and pandas.

Now it’s time to take a first look at the data we will use in our case study. We won’t do anything in this lesson other than ensure that we can load the data into a Jupyter notebook correctly. Examining the data, and understanding the problem you will solve with it, will come later.

The data file is an Excel spreadsheet called default_of_credit_card_clients__courseware_version_1_21_19.xls. We recommend you first open the spreadsheet in Excel or the spreadsheet program of your choice. Note the number of rows and columns. Look at some example values. This will help you know whether or not you have loaded it correctly in the Jupyter notebook.
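As a rough sketch of what that check could look like once you are in a notebook, the cell below loads the spreadsheet with pandas and reports its dimensions so you can compare them with what you saw in the spreadsheet program. It assumes the file sits in the notebook's working directory and that the optional xlrd package, which pandas uses for legacy .xls files, is installed. The summarize helper is a hypothetical name introduced only for this illustration.

```python
from pathlib import Path

import pandas as pd

# Assumption: the spreadsheet is in the notebook's working directory.
DATA_PATH = Path("default_of_credit_card_clients__courseware_version_1_21_19.xls")


def summarize(df):
    """Return the row count, column count, and column names of a DataFrame."""
    n_rows, n_cols = df.shape
    return {"rows": n_rows, "columns": n_cols, "names": list(df.columns)}


if DATA_PATH.exists():
    # Reading legacy .xls files requires the optional xlrd package.
    df = pd.read_excel(DATA_PATH)
    print(summarize(df))
    # Show the first few rows to compare against the spreadsheet by eye.
    print(df.head())
else:
    print(f"File not found: {DATA_PATH.resolve()}")
```

If the reported row and column counts do not match the spreadsheet (allowing for the header row, which pandas uses for column names by default), the file was probably not read the way you expected.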

Note: The dataset we will be using is a modified version of the original dataset, which has been sourced from the UCI Machine Learning Repository.

What is a Jupyter notebook?

Jupyter notebooks are interactive coding environments that allow for inline text and graphics. They are great tools for data scientists to communicate and preserve their results because both the methods (code) and the message (text and graphics) are integrated. You can think of the environment as a kind of web page where you can write and execute code. Jupyter notebooks can, in fact, be rendered as web pages.
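To make the idea of integrated code and text concrete, here is a minimal sketch of how a notebook is stored on disk: a JSON document holding a list of cells, each marked as either markdown (the message) or code (the methods). The cell contents below are invented for illustration, and a real notebook saved by Jupyter carries more metadata than this.

```python
import json

# A hand-built, minimal notebook in the version 4 notebook format.
# The cell contents are illustrative only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# Loading the case study data",
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": "import pandas as pd",
        },
    ],
}

# Serialized this way, the text could be saved as an .ipynb file.
serialized = json.dumps(notebook, indent=1)

# Text and code live side by side in the same document.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

A saved notebook can then be turned into a web page with the nbconvert tool, for example by running jupyter nbconvert --to html on the .ipynb file.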
