Time Series Data Types

Explore the key concepts behind handling time series data types in pandas.

Introduction

A key part of data analytics involves working with data collected over time. A time series is a sequence of data points recorded at successive points in time, and these data points can represent anything from stock prices and weather measurements to social media metrics and website traffic.

The main feature of a time series is that the data points are ordered chronologically, with each data point corresponding to a specific point in time. The ability to analyze time series data is important for a number of reasons:

  • Trend analysis: Time series analysis allows us to identify trends and patterns in the data that may not be apparent from a simple visualization of the raw data.

  • Forecasting: Time series analysis allows us to predict future behavior based on historical data. For example, we can use time series analysis to forecast demand for a particular product.

  • Seasonality: Many time series exhibit seasonal patterns, such as increased sales during the summer months. With this understanding, we can better allocate resources for future events.

  • Anomaly detection: Time series analysis can help us identify unusual or anomalous behaviors. For example, we can use time series analysis to identify unusual patterns in financial transactions.

The pandas library has an extensive range of capabilities and features to work with time series data, and it’s something we’ll explore in greater detail in the following lessons.

Time series data objects

First, let’s review the fundamental time series data types. There are four main time series objects in pandas, and each represents a time-related concept: Timestamp (a specific point in time), Timedelta (an absolute duration), Period (a span of time, such as a month), and DateOffset (a relative duration that respects calendar arithmetic).
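To make the four concepts concrete, here is a minimal sketch that creates one object of each kind. The dates and durations are illustrative values chosen for this example, not taken from the lesson.

```python
import pandas as pd

# Timestamp: a single point in time
ts = pd.Timestamp("2023-01-15 09:30:00")

# Timedelta: an absolute duration (1 day and 12 hours)
delta = pd.Timedelta(days=1, hours=12)

# Period: a span of time (here, the whole of January 2023)
period = pd.Period("2023-01", freq="M")

# DateOffset: a calendar-aware shift (one calendar month)
offset = pd.DateOffset(months=1)

print(ts + delta)         # 2023-01-16 21:30:00
print(ts + offset)        # 2023-02-15 09:30:00
print(period.start_time)  # 2023-01-01 00:00:00
```

Note the difference between the two kinds of duration: the Timedelta adds a fixed amount of elapsed time, while the DateOffset moves the date forward by one calendar month regardless of how many days that month has.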

The core of these time series objects is built upon the datetime64 and timedelta64 data types from NumPy. Beyond that, pandas has also consolidated many features from other libraries like scikits.timeseries.

Timestamp

The Timestamp object represents the most basic time series component: a single point in time, expressed in the date and time format we’re familiar with (i.e., year, month, day, hour, minute, and second). It’s the pandas equivalent of the datetime.datetime object from the standard Python library, is interchangeable with it in most cases, and comes with time zone support.

The pandas data types associated with the Timestamp object are datetime64[ns] and datetime64[ns, tz], where the latter is time zone-aware.

Note: datetime64[ns] is a data type used to represent date-time values with nanosecond precision (ns).
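As a small sketch of the two data types, the snippet below builds a time zone-naive Series and a time zone-aware one and inspects their dtype. The sample dates and the choice of UTC are assumptions made for illustration. Because pandas 2.0 and later also support other resolutions, the explicit cast pins the unit to nanoseconds.

```python
import pandas as pd

# Time zone-naive datetimes, explicitly cast to nanosecond resolution
naive = pd.Series(pd.to_datetime(["2023-01-15", "2023-01-16"])).astype("datetime64[ns]")

# Attach a time zone (UTC here) to make the values time zone-aware
aware = naive.dt.tz_localize("UTC")

print(naive.dtype)  # datetime64[ns]
print(aware.dtype)  # datetime64[ns, UTC]
```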

Timestamped data is the most basic type of time series data and associates values with points in time. The Timestamp() constructor readily converts a string into a timestamp with a familiar date format.
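A minimal sketch of this conversion is shown below. The date string is an illustrative value, and the tz argument is included only to show the time zone support mentioned earlier.

```python
import pandas as pd

# Parse a date-time string into a Timestamp
ts = pd.Timestamp("2023-01-15 09:30:00")
print(ts)                         # 2023-01-15 09:30:00

# Individual components are available as attributes
print(ts.year, ts.month, ts.day)  # 2023 1 15
print(ts.day_name())              # Sunday

# Passing tz makes the Timestamp time zone-aware
ts_utc = pd.Timestamp("2023-01-15 09:30:00", tz="UTC")
print(ts_utc)                     # 2023-01-15 09:30:00+00:00
```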
