Series Introduction

A Series is used to model one-dimensional data. The Series object also has a few more bits of data, including an index and a name. A common idea in pandas is the notion of an axis. Because a series is one-dimensional, it has a single axis—the index.

Below is a table of counts of songs several artists composed. We’ll use this to explore the series:
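As a minimal sketch, the counts from the table could be loaded into a Series as follows. The variable name songs and the series name "counts" are assumptions chosen for illustration; they are not taken from the lesson's original code.

```python
import pandas as pd

# The four counts come from the table in this lesson.
# The variable name `songs` and the name "counts" are illustrative assumptions.
songs = pd.Series([145, 142, 38, 13], name="counts")

# Printing shows the index on the left and the values on the right,
# followed by the series name and dtype.
print(songs)
```

Because no index was passed, pandas supplies the default integer labels 0 through 3.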


When the interpreter prints our Series, pandas makes a best effort to format it for the current terminal size. Although the series is one-dimensional, the printed output looks two-dimensional because the leftmost column shows the index. The index is not part of the values. The generic name for an index is an axis, and the values of the index (0, 1, 2, 3) are called axis labels. The data (145, 142, 38, and 13) are called the values of the series. The two-dimensional structure in pandas, the DataFrame, has two axes: one for the rows and another for the columns.
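The distinction between axis labels and values can be checked directly. The sketch below assumes the same illustrative series as before (the name songs is an assumption) and compares the number of axes on a Series with the number on a DataFrame built from it.

```python
import pandas as pd

# `songs` and "counts" are illustrative names, not taken from the lesson.
songs = pd.Series([145, 142, 38, 13], name="counts")

labels = list(songs.index)   # the axis labels
values = songs.to_numpy()    # the data, as a NumPy array

series_axes = songs.axes             # a list holding one axis: the index
frame_axes = songs.to_frame().axes   # a list holding two axes: rows and columns

print(labels)
print(values)
print(len(series_axes), len(frame_axes))
```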

The rightmost column in the output contains the values of the series (145, 142, 38, and 13). In this case, they’re integers (the console representation says dtype: int64, where dtype means data type and int64 means 64-bit integer), but in general, a Series can hold strings, floats, booleans, or arbitrary Python objects.
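To see how the reported dtype follows from the values, here is a small sketch. The mixed values in the second series are hypothetical and exist only to show the fallback to the object dtype.

```python
import pandas as pd

# Integer values are stored as 64-bit integers.
counts = pd.Series([145, 142, 38, 13])

# Hypothetical mixed values: an int, a string, a float, and None.
# pandas cannot pick one numeric type, so it stores Python objects.
mixed = pd.Series([145, "many", 3.5, None])

print(counts.dtype)  # int64
print(mixed.dtype)   # object
```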

To get the best speed (and to leverage vectorized operations), the values should be of the same type, though this is not required. It’s easy to inspect the index of a Series (or DataFrame), since it’s an attribute of the object:
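A minimal sketch of inspecting the index through the index attribute follows. The names songs and labeled, and the string labels "a" through "d", are hypothetical placeholders; the lesson's own labels are not reproduced here.

```python
import pandas as pd

# Default index: pandas generates integer labels automatically.
songs = pd.Series([145, 142, 38, 13], name="counts")
print(songs.index)  # RangeIndex(start=0, stop=4, step=1)

# Hypothetical string labels, used only to show a non-default index.
labeled = pd.Series(
    [145, 142, 38, 13],
    index=["a", "b", "c", "d"],
    name="counts",
)
print(labeled.index)

# Labels, not positions, are used for lookup here.
print(labeled["c"])  # 38
```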


In the above case, the dtype (data type) of the Series is object, meaning a Python object. This can be either good or bad.

The object data type is also used for a series with string values, as well as for values that have heterogeneous or mixed types. If we have only numeric data in a Series, we wouldn’t want it stored as Python objects but rather as int64 or float64, which allows us to do vectorized numeric operations.
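The sketch below illustrates the difference by storing the same numbers twice: once forced into the object dtype and once converted to int64. Using astype for the conversion is just one option and is an assumption here, not necessarily the approach the course takes later.

```python
import pandas as pd

# Force the numbers to be stored as generic Python objects.
as_objects = pd.Series([145, 142, 38, 13], dtype=object)

# Convert to a native 64-bit integer dtype (one possible conversion).
as_ints = as_objects.astype("int64")

print(as_objects.dtype)  # object
print(as_ints.dtype)     # int64

# Arithmetic on the int64 series runs as a vectorized operation.
doubled = as_ints * 2
print(doubled.tolist())  # [290, 284, 76, 26]
```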

If we have time data and pandas reports the object type, we probably have strings for the dates. Using strings instead of date types is bad because we don’t get the date operations that we would get if the type were datetime64[ns]. A series that genuinely holds string data, on the other hand, is expected to have the object type. Don’t worry; we’ll see how to convert types later in the course.
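As a hedged illustration of why date types matter, the sketch below parses a few hypothetical date strings with pd.to_datetime. The dates are made up for the example, and the exact dtypes printed can differ between pandas versions.

```python
import pandas as pd

# Hypothetical date strings; stored as text, they offer no date operations.
raw = pd.Series(["2021-01-15", "2021-02-20", "2021-03-05"])

# Parsing produces a datetime64 dtype (datetime64[ns] in many pandas versions).
dates = pd.to_datetime(raw)

print(raw.dtype)
print(dates.dtype)

# Date-specific accessors become available after conversion.
months = dates.dt.month.tolist()
print(months)  # [1, 2, 3]
```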

Counts of songs artists composed

Artist	Data
0	145
1	142
2	38
3	13