Python’s pandas library handles regular tabular data, while GeoPandas is an extension of pandas that adds functions for working with geospatial data, such as maps and coordinates.
How to plot world population density using GeoPandas
Key takeaways:
GeoPandas simplifies spatial data processing by allowing easy calculations like area, boundary, and creating various plots such as choropleth and layered maps.
GeoPandas supports multiple data formats, such as JSON and SHP files, making it versatile for handling different types of spatial data.
Spatial data in GeoPandas is stored in GeoDataFrames, which represent geographical shapes like polygons, lines, and points for plotting locations and areas.
A CRS (Coordinate Reference System) defines how the two-dimensional, flat map in GeoPandas relates to real places on Earth.
The to_crs() method changes the coordinate reference system; reprojecting to an equal-area CRS gives accurate area calculations.
GeoPandas’ plot() method lets you visualize population density using color maps, boundary lines, and legends.
Customized visualizations can be created by combining GeoPandas with libraries like Matplotlib for clearer maps and better presentation.
The goal of GeoPandas is to make spatial data processing easier in Python. It provides high-level functions for calculating properties such as area and boundary, and for drawing basic, choropleth, layered, or interactive plots of multiple geometries and shapes. These capabilities are particularly valuable for visualizing population density, a key factor in urban planning, resource allocation, and understanding demographic patterns. For example, such visualizations help identify overpopulated areas needing infrastructure improvements or sparsely populated regions where resources may be underutilized. GeoPandas can read multiple data formats, such as GeoJSON files and shapefiles (SHP). It stores spatial data as a GeoSeries or GeoDataFrame whose geometries (polygons, linestrings, and points) represent geographical areas, paths, or locations.
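As a minimal sketch of these geometry types, the snippet below builds a small GeoDataFrame by hand instead of reading a file. The shapes, labels, and coordinates are made up for illustration and use plain planar units with no CRS.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, Polygon

# Hypothetical toy geometries in plain planar units (no CRS set)
shapes = gpd.GeoDataFrame(
    {"label": ["location", "path", "area"]},
    geometry=[
        Point(1, 1),
        LineString([(0, 0), (2, 2)]),
        Polygon([(0, 0), (4, 0), (4, 3), (0, 3)]),
    ],
)

# Geometry type of each row
print(shapes.geom_type.tolist())

# Area per row; points and lines have zero area
print(shapes.area.tolist())

# Length of the polygon's outline, obtained from its boundary
polygon_outline_length = shapes.boundary.iloc[2].length
print(polygon_outline_length)
```

The area and boundary attributes work row by row, so the same calls apply unchanged to a GeoDataFrame loaded from a real dataset.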
GeoPandas also enables the projection of geographical data into different coordinate reference systems (CRS), which is an integral part of spatial data processing. To give you a flavor of plots in GeoPandas, here is a sample boundary plot created with the boundary.plot() method of GeoPandas.
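A boundary plot of this kind can be sketched as follows. The two rectangular regions are hypothetical stand-ins for real country shapes, and the title text is an arbitrary choice.

```python
import geopandas as gpd
from shapely.geometry import box

# Two hypothetical rectangular "regions" given in longitude/latitude
regions = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[box(0, 0, 2, 1), box(2, 0, 3, 2)],
    crs="EPSG:4326",
)

# boundary returns the outlines as a GeoSeries; plot() draws them as lines
ax = regions.boundary.plot(color="black", linewidth=1, figsize=(6, 4))
ax.set_title("Region boundaries")
```

In a notebook the figure renders inline; in a script, call matplotlib's plt.show() afterwards to display it.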
Creating a population density map
To create a world population density map, we will use a GeoJSON file containing global population data as a dataset. The dataset contains various attributes for each region, including:
NAME: The name of the country or region.
ISO_3_CODE and ISO_2_CODE: ISO country codes for identifying regions.
AREA: The total area of the region.
NAME_1: Subregion or additional name information.
POP2005: Population estimates from the year 2005.
REGION: The larger region to which the country belongs.
GMI_CNTRY: Country classification based on Gross National Income (GNI).
NAME_12: Another naming variant.
geometry: Geospatial data representing the region’s boundaries.
Note that a country's geometry can be a single polygon made up of hundreds of points or a multipolygon made up of several polygons. A snapshot of the geometry column is shown below.
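To illustrate the difference between the two geometry kinds, here is a small sketch with made-up shapes: one "country" is a single polygon and the other is a multipolygon consisting of a mainland and an island. The names and coordinates are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Hypothetical shapes in plain planar units
mainland = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
island = Polygon([(6, 0), (7, 0), (7, 1), (6, 1)])

countries = gpd.GeoDataFrame(
    {"NAME": ["Landlocked", "Archipelago"]},
    geometry=[mainland, MultiPolygon([mainland, island])],
)

# Geometry type of each row
print(countries.geom_type.tolist())

# Number of separate polygons that make up each row's geometry
parts = [
    len(g.geoms) if g.geom_type == "MultiPolygon" else 1
    for g in countries.geometry
]
print(parts)

# area sums over all parts of a multipolygon
print(countries.area.tolist())
```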
Since the GeoDataFrame already contains the population attribute, only the area needs to be calculated to find the population density of each country. To calculate areas accurately, the geometry must first be reprojected to EPSG:6933, an equal-area projection whose units are meters.
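The effect of the reprojection can be sketched on a single shape. The 1° by 1° cell below is a hypothetical example, not part of the dataset; it is defined in longitude/latitude (EPSG:4326) and reprojected before measuring.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical 1-degree by 1-degree cell touching the equator (EPSG:4326)
cell = gpd.GeoSeries([box(0, 0, 1, 1)], crs="EPSG:4326")

# Reproject to the equal-area EPSG:6933 so that .area is in square meters,
# then convert square meters to square kilometers
area_km2 = cell.to_crs(6933).area.iloc[0] / 1e6
print(round(area_km2))
```

Calling .area directly on the longitude/latitude geometry would instead return a value in square degrees, which is not meaningful as a land area.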
The following code calculates and plots population density using GeoPandas.
```python
# Import relevant libraries
import geopandas as gpd
import matplotlib.pyplot as plt

# Read and process dataset
world_pop = gpd.read_file('https://raw.githubusercontent.com/MinnPost/simple-map-d3/master/example-data/world-population.geo.json')
world_pop['POP2005'] = world_pop['POP2005'].astype(float)
world_pop['area'] = world_pop.to_crs(6933).area.astype(float) * 0.000001  # m² to km²
world_pop['density'] = world_pop['POP2005'].div(world_pop['area'])
world_pop.head()

# Create population density map (scheme requires the mapclassify package)
world_pop.plot(cmap='Blues', linewidth=0.2, scheme='quantiles', edgecolor='gray', column='density',
               legend=True,
               figsize=(10, 10),
               legend_kwds={"loc": "center left", "bbox_to_anchor": (1, 0.5)},)
plt.title('World Population Density Map')
```
Code explanation
Let’s understand the code above:
Lines 1–3: Import geopandas and matplotlib.
Lines 6–7: Read the data using the read_file() method of GeoPandas, then convert the POP2005 column, which holds the world population in the year 2005, to the float data type.
Line 8: Use the to_crs() method and the area attribute of GeoPandas to calculate the areas of countries after reprojecting the CRS. It also converts the area from m² to km².
Lines 9–10: Calculate the density by dividing the population by the area, then show the first few rows of the GeoDataFrame.
Lines 13–16: Create the density plot using the plot() method of GeoPandas. It uses the following arguments:
cmap sets the color map.
linewidth sets the width of the boundaries between countries.
scheme divides the density attribute into intervals (classes).
edgecolor sets the color of the boundaries.
column selects the attribute used to color the map.
legend is set to True to include the legend in the figure.
figsize defines the size of the figure.
legend_kwds defines the location of the legend.
Line 17: Add a title to the map using plt.title(). The title is set after plot() because plot() creates its own figure when figsize is given, so a title set beforehand would land on a separate, empty figure.
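To see what a quantile scheme does to skewed values, here is a minimal sketch using pandas.qcut as a stand-in. It is not the mapclassify call that GeoPandas makes internally, the density values are made up, and the choice of five classes is an assumption for illustration.

```python
import pandas as pd

# Hypothetical densities (people per km²), skewed like real-world data
density = pd.Series([2, 15, 40, 90, 150, 300, 450, 1200, 6500, 20000])

# Split into five classes holding an equal number of values;
# labels=False returns the class index of each value instead of an interval
classes = pd.qcut(density, q=5, labels=False)
print(classes.tolist())
```

Because each class holds the same number of regions, a few extremely dense places do not wash out the color differences among the rest of the map.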