Examples of Time Series Data

Understand time series through a few examples of real-world data.

Don’t worry if you’ve never used Python or pandas and don’t fully understand the code yet. All of that will be covered in the next sections. The goal here is to see some examples and start getting familiar with the way Python handles time series data.

Microsoft stock prices

Take a look at the first five rows of a dataset containing Microsoft stock prices and see how those prices change over time.

import pandas as pd
df = pd.read_csv('microsoft_stock.csv')
print(df.head())

Many questions pop up just from looking at these rows:

  • What is the format of the “Date” column? Does “4/1/2015” mean “April 1st” or “January 4th”? In the US it would be the former, while in most other countries it would be the latter.

  • Where is the data for “4/3/2015,” “4/4/2015,” and “4/5/2015”?

  • What is the most recent data available?
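These questions can be checked in code. The following is a minimal sketch, not the lesson's own analysis: it uses a tiny hand-made table (the rows and `Close` values are hypothetical placeholders, not the real file) and assumes the date strings carry a time of day. It parses the same strings month-first and day-first, lists the calendar days that have no row, and finds the most recent date.

```python
import pandas as pd

# Hypothetical sample rows; the Close values are placeholders, not real prices
raw = pd.DataFrame({
    "Date": ["4/1/2015 16:00:00", "4/2/2015 16:00:00", "4/6/2015 16:00:00"],
    "Close": [40.0, 40.5, 41.0],
})

# An explicit format removes the ambiguity: %m/%d reads the month first
us = pd.to_datetime(raw["Date"], format="%m/%d/%Y %H:%M:%S")
# The same strings read day-first produce different dates
intl = pd.to_datetime(raw["Date"], format="%d/%m/%Y %H:%M:%S")
print(us.iloc[0].date())    # 2015-04-01
print(intl.iloc[0].date())  # 2015-01-04

# Calendar days between the first and last date that have no row
days = us.dt.normalize()
all_days = pd.date_range(days.min(), days.max(), freq="D")
missing = all_days.difference(pd.DatetimeIndex(days))
print(list(missing.strftime("%Y-%m-%d")))
# ['2015-04-03', '2015-04-04', '2015-04-05']

# Most recent timestamp in the data
latest = us.max()
print(latest)  # 2015-04-06 16:00:00
```

In stock data, gaps like these usually correspond to days when the market was closed, such as weekends and holidays.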

As you might have noticed, when looking at a time series in a table, sometimes it can be hard to see the big picture. The standard way to visualize time series data is by using line charts.

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('microsoft_stock.csv')
# Parse the "Date" strings as datetimes (month/day/year plus a time of day)
df["Date"] = pd.to_datetime(df['Date'], format='%m/%d/%Y %H:%M:%S')
# Use the dates as the index so they become the x-axis
df = df.set_index('Date')
# Plot the closing price as a line chart
fig, ax = plt.subplots(figsize=(7, 4.5), dpi=300)
ax.plot(df['Close'])
plt.xlabel("Date")
plt.ylabel("Price")
plt.title("Microsoft stock closing price")
fig.savefig("output/output.png")
plt.close(fig)

As you can see, the line chart gives a much better overview of the stock price data. The series runs from 2015 to 2021 and shows the price increasing over time, apart from some bumps in 2019 and 2020.
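Both observations, the time span and the upward trend, can also be checked numerically. The sketch below is only an illustration: it builds a synthetic business-day series (a straight line plus a wave, not the real Microsoft prices) and uses a rolling mean, which is one common way to smooth out short-term bumps. The 250-row window is an assumption, chosen to approximate one trading year.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the closing price (not real data):
# a steady rise plus a wave that imitates short-term bumps
idx = pd.date_range("2015-01-01", "2020-12-31", freq="B")
t = np.arange(len(idx))
close = pd.Series(40 + 0.1 * t + 5 * np.sin(t / 10), index=idx, name="Close")

# Span of the series: first and last dates in the index
print(close.index.min().date(), close.index.max().date())

# Rolling mean over 250 rows (roughly one trading year) smooths the bumps
trend = close.rolling(window=250).mean().dropna()

# An upward trend shows up as the smoothed series ending above where it starts
print(trend.iloc[-1] > trend.iloc[0])  # True
```

On the real dataset, the same two steps would be applied to `df['Close']` after the dates have been set as the index.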

Seattle weather

Now, let’s take a look at a dataset containing weather data, such as the max temperature and humidity, for Seattle (US) and plot how the daily max temperature changes over time.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dt
df = pd.read_csv('seattle_weather.csv')
# Parse the "date" strings (year-month-day) as datetimes
df["date"] = pd.to_datetime(df['date'], format='%Y-%m-%d')
# Use the dates as the index so they become the x-axis
df = df.set_index('date')
fig, ax = plt.subplots(figsize=(7, 4.5), dpi=300)
ax.plot(df['temp_max'])
# Format the x-axis: one major tick per year, minor ticks every three months
ax.xaxis.set_major_locator(dt.YearLocator())
ax.xaxis.set_minor_locator(dt.MonthLocator((1,4,7,10)))
ax.xaxis.set_major_formatter(dt.DateFormatter("\n%Y"))
ax.xaxis.set_minor_formatter(dt.DateFormatter("%b"))
plt.setp(ax.get_xticklabels(), rotation=0, ha="center")
plt.subplots_adjust(bottom=0.15)
# Labelling
plt.xlabel("Date")
plt.ylabel("Temperature (max)")
plt.title("Seattle daily max temperature")
fig.savefig("output/output.png")
plt.close(fig)

Here we can see more variability and a clear seasonal pattern. Every year, temperatures reach a peak around July/August and a low around December/January. On the other hand, there is no clear trend: the overall temperature doesn't seem to be getting significantly higher or lower from year to year.
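The difference between a seasonal pattern and a trend can be made concrete by averaging the same series in two ways. The sketch below is a hedged illustration on synthetic data (a yearly cosine cycle in arbitrary units, not the real Seattle readings): averaging by calendar month exposes the seasonal cycle, while averaging by year shows whether the overall level is drifting.

```python
import numpy as np
import pandas as pd

# Synthetic daily max temperatures (not the real Seattle data):
# a yearly cycle around 15 (arbitrary units) with its low in mid-January
idx = pd.date_range("2012-01-01", "2015-12-31", freq="D")
day = idx.dayofyear.to_numpy()
temp_max = pd.Series(
    15 - 10 * np.cos(2 * np.pi * (day - 15) / 365.25),
    index=idx,
    name="temp_max",
)

# Seasonality: average by calendar month across all years
by_month = temp_max.groupby(temp_max.index.month).mean()
print(by_month.idxmax(), by_month.idxmin())  # 7 1 (warmest: July, coldest: January)

# Trend: average by year; nearly constant values mean no clear trend
by_year = temp_max.groupby(temp_max.index.year).mean()
print(by_year.round(1))
```

With the real data, `temp_max` would be replaced by `df['temp_max']` once the `date` column has been parsed and set as the index.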