Get Time Series Datasets
Explore multiple reputable sources to obtain time series datasets, including Kaggle, the UCI Machine Learning Repository, World Bank Open Data, and Google Trends. Learn how to register, search, and download CSV files suitable for analysis and forecasting in Python.
Kaggle
Kaggle is one of the best sources of public datasets. It also hosts competitions where you can test your skills against other people, which is a great way to practice!
Getting set up
Click “Register” and create an account. (You need an account to download data.)
Search for “time series” in the “Search” bar. (You can also search for any topic that interests you and check whether the results contain time series data.)
Click “Datasets” to filter for datasets.
There will be a list of datasets for different topics—weather, stock prices, population, energy consumption, sales, etc. Choose one that sparks your interest and click on it.
Click “Download.”
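Once a file is downloaded, a first step is usually to load it with the date column parsed as the index. The following is a minimal sketch with pandas, not a fixed recipe: the helper name `load_time_series`, the column names, and the sample values are all made up for illustration, and a real Kaggle file will have its own column names.

```python
import io

import pandas as pd


def load_time_series(csv_source, date_column):
    """Read a CSV and return a DataFrame indexed by a sorted DatetimeIndex.

    `csv_source` can be a file path or a file-like object.
    `date_column` is the name of the column holding the timestamps.
    """
    df = pd.read_csv(csv_source, parse_dates=[date_column])
    return df.set_index(date_column).sort_index()


# Made-up sample standing in for a downloaded file. With a real download,
# pass its path instead, for example load_time_series("my_dataset.csv", "date").
sample_csv = io.StringIO(
    "date,temperature\n"
    "2023-01-03,4.1\n"
    "2023-01-01,5.2\n"
    "2023-01-02,3.8\n"
)
weather = load_time_series(sample_csv, "date")
print(weather.head())
```

Sorting by the index matters because downloaded files are not always stored in chronological order, and most time series operations assume they are.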
UCI Machine Learning Repository
Despite its retro design, the UCI Machine Learning (ML) Repository is also a great source of datasets in general, and it contains around 50 time series datasets.
To find them, open the repository's dataset listing filtered to show only time series data.
You'll find data on air quality, social media, and health. Once you find something you like, click it and then click "Download Data Folder" to get a compressed file for your data.
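The compressed download has to be unpacked before the data can be read. Here is a small sketch using only the standard library, assuming the download is a `.zip` archive. The helper `extract_data_files` and the file names are hypothetical, and some repository files use extensions such as `.data` or `.txt` instead of `.csv`, so the search pattern may need adjusting.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_data_files(archive_path, target_dir, pattern="*.csv"):
    """Extract a .zip archive and return the extracted files matching `pattern`."""
    target = Path(target_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Search subfolders too, since archives often contain nested directories.
    return sorted(target.rglob(pattern))


# Demonstration with a tiny archive built on the fly (placeholder contents).
# With a real download, pass the path of the downloaded file instead.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "example.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("air_quality.csv", "date,value\n2023-01-01,1\n")
        archive.writestr("readme.txt", "placeholder")
    found = extract_data_files(archive_path, Path(tmp) / "extracted")
    found_names = [path.name for path in found]

print(found_names)
```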
World Bank Open Data
On the World Bank Open Data website, you can find lots of macroeconomic and microeconomic data for many countries. This site has historical data for indicators such as unemployment rates and CO2 emissions organized by category.
Note: This is a good source if you're interested in research or if you want to enrich your machine learning models with external data.
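Indicator downloads of this kind often arrive in a wide layout, with one row per country and one column per year, while most time series tools expect one row per observation. The sketch below reshapes such a table with pandas. The identifier column names in `ID_COLUMNS` are an assumption about the export layout, the helper `wide_to_long` is hypothetical, and the countries and values are invented, so check them against your actual file.

```python
import pandas as pd

# Assumed identifier columns; adjust to match the header of your file.
ID_COLUMNS = ["Country Name", "Country Code", "Indicator Name", "Indicator Code"]


def wide_to_long(wide):
    """Turn a table with one column per year into one row per (country, year)."""
    id_cols = [col for col in ID_COLUMNS if col in wide.columns]
    long = wide.melt(id_vars=id_cols, var_name="year", value_name="value")
    # Column labels that are not years (e.g. stray empty columns) become NaN.
    long["year"] = pd.to_numeric(long["year"], errors="coerce")
    long = long.dropna(subset=["year", "value"]).assign(
        year=lambda frame: frame["year"].astype(int)
    )
    return long.sort_values(id_cols + ["year"]).reset_index(drop=True)


# Invented example data, for illustration only.
sample = pd.DataFrame(
    {
        "Country Name": ["Exampleland", "Samplestan"],
        "Country Code": ["EXL", "SMP"],
        "2019": [5.0, 7.5],
        "2020": [6.0, None],
    }
)
tidy = wide_to_long(sample)
print(tidy)
```

Rows with missing values are dropped here for simplicity. Depending on the analysis, keeping them and interpolating later may be the better choice.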
Google Trends
Google Trends lets you enrich your data with search trends from Google Search. It shows how popular a given search term has been over time, and you can download the results as a .csv file to use in your analysis.
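To illustrate what enriching your data can look like, the sketch below attaches a weekly search-interest series to a daily dataset by forward-filling each weekly value across the days that follow it. The helper `add_trend_feature`, the column names, and all numbers are made up, and forward-filling is only one possible way to align the two frequencies.

```python
import pandas as pd


def add_trend_feature(data, trends, column_name="search_interest"):
    """Attach a lower-frequency series to `data` as a new column.

    Each row of `data` receives the most recent trend value at or before
    its timestamp. Rows earlier than the first trend value get NaN.
    """
    aligned = trends.sort_index().reindex(data.index, method="ffill")
    return data.assign(**{column_name: aligned})


# Invented daily sales and weekly search interest, for illustration only.
sales = pd.DataFrame(
    {"units_sold": [12, 15, 11, 14, 18, 20, 17, 13, 16, 19]},
    index=pd.date_range("2023-01-01", periods=10, freq="D"),
)
trends = pd.Series(
    [40, 55],
    index=pd.to_datetime(["2023-01-01", "2023-01-08"]),
)

enriched = add_trend_feature(sales, trends)
print(enriched)
```

If the weekly figure summarizes the whole week, the early days of that week receive information from later days, so for forecasting it may be safer to shift the trend series by one period before joining.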
Conclusion
You've now seen several useful sources of time series datasets. As you might have noticed, almost all of these datasets come in the .csv format, which is standard for most public datasets. You can open this type of file for a quick inspection in any spreadsheet program or text editor.
If you want to do a deeper analysis, you'll need to open these files in Python as you did in our previous examples.