What is a choropleth map?

Oct 31, 2025
Content
Geospatial analysis
GeoPandas library
Choropleth maps
Categorical choropleth map
Case Study: Visualizing the global COVID-19 vaccination rate
Use the new geodatasets package for loading data
Choose the right map projection
Normalize your data before mapping
Control classification with mapclassify
When not to use a choropleth map
Understand spatial pitfalls: MAUP and ecological fallacy
Use color palettes thoughtfully
Explore advanced techniques: bivariate and interactive maps
Work with real data in your case studies
Conclusion

A choropleth map shades each geographic region according to the value of a variable measured there, which makes it easy to compare data on a region-by-region basis. As geospatial analysis has matured, many map plotting techniques have become available. In this blog, we will study choropleth maps and plot them using the GeoPandas Python library.

Geospatial analysis#

Geospatial analysis examines, interprets, and manipulates geographic data and information associated with specific locations on the Earth’s surface. It consists of various techniques and methodologies for understanding spatial patterns, relationships, and trends within geographic data.

GeoPandas library#

GeoPandas is an open-source Python library that extends the capabilities of pandas, a widely used data manipulation library, to handle geospatial data more efficiently. It provides a user-friendly and powerful interface for working with geospatial datasets.

GeoPandas helps read, write, visualize, and analyze geographic data in various formats, such as shapefiles, GeoJSON, and other formats supported by the Geospatial Data Abstraction Library (GDAL). It integrates seamlessly with data analysis workflows, which makes it easier to combine geospatial data with non-spatial data and to perform complex spatial operations.

We can install this library with just one command:

pip install geopandas

Let’s learn about choropleth maps, their types, and their advantages.

Choropleth maps#

The word choropleth combines two Greek words: choros, which signifies region, and plethos, which means multitude. A choropleth map uses different colors or shading patterns to show how a specific variable varies across geographic areas, such as countries, states, provinces, counties, or other administrative divisions.

Let’s create our first choropleth map in Python. The examples in the next three sections load their sample data through gpd.datasets, which was deprecated in GeoPandas 0.14 and removed in GeoPandas 1.0, so they run as written only on GeoPandas versions earlier than 1.0:

import geopandas as gpd
import matplotlib.pyplot as plt
world_map = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
fig, ax = plt.subplots(figsize=(10, 6))
variable = "pop_est"
cmap = "viridis"
world_map.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Line 1: We import the geopandas library.

  • Line 2: We import the pyplot module from the matplotlib library, which is used for creating plots and visualizations.

  • Line 3: We call gpd.datasets.get_path to retrieve the file path of naturalearth_lowres, a built-in Natural Earth dataset that contains low-resolution geometries and attributes of countries. The gpd.read_file method reads that file and returns a GeoDataFrame, a specialized data structure in GeoPandas for handling geospatial data, which we store in world_map.

  • Line 4: We create a figure and a set of axes with plt.subplots.

  • Line 5: We define the variable to be displayed on the choropleth map. In this case, it is set to "pop_est", representing the population estimate attribute in the dataset.

  • Line 6: We define the colormap to be used for the choropleth map. The name "viridis" refers to a perceptually uniform sequential colormap.

  • Line 7: We plot the choropleth map using the GeoDataFrame world_map.

    • column=variable: Specifies the column in the GeoDataFrame that contains the data to be visualized. Here, it is set to the value of the variable, which is "pop_est".

    • cmap=cmap: Sets the colormap for the choropleth map.

    • linewidth=0.8: Sets the width of the boundary lines between the polygons on the map.

    • ax=ax: Specifies the axes on which to plot the map. In this case, it uses the ax created on Line 4 with plt.subplots.

    • edgecolor='0.8': Sets the color of the boundary lines between the polygons (here, a light gray).

    • legend=True: Adds a color bar that shows how colors correspond to values.

Here is the result of the aforementioned code:

World choropleth map—population estimate

Categorical choropleth map#

To create a categorical choropleth map with legends, we’ll use a modified version of the world map data that contains categorical data, such as regions or categories for different countries. For this example, we’ll use the Natural Earth dataset again, but create a new column called Category to represent the categories for each country.

Here’s the code to create a categorical choropleth map with legends:

import geopandas as gpd
import matplotlib.pyplot as plt
world_map = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
categories = {
'United States of America': 'North America',  # must match the dataset's name column exactly
'Canada': 'North America',
'Brazil': 'South America',
'China': 'Asia',
'India': 'Asia',
'Australia': 'Oceania',
'France': 'Europe',
'South Africa': 'Africa',
}
world_map['Category'] = world_map['name'].map(categories)
fig, ax = plt.subplots(figsize=(20, 10))
variable = 'Category'
cmap = 'Set1'
world_map.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Lines 4–13: We define a dictionary named categories. It maps country names to their respective categories or regions.

  • Line 14: We add a new column Category to the world_map GeoDataFrame. It maps the values in the name column (country names) through the categories dictionary. Countries that are not in the dictionary receive a missing value (NaN), and GeoPandas does not draw them unless missing_kwds is passed to plot.

The output of this code will be the following map:

World categorical choropleth map

Remember, these maps are versatile tools that can uncover insights with just a glance and can be employed in a wide range of real-world scenarios.

Let's explore an application scenario for choropleth maps.

Case Study: Visualizing the global COVID-19 vaccination rate#

In response to the COVID-19 pandemic, governments worldwide launched vaccination campaigns to curb the spread of the virus. To show how such progress could be assessed, we’ll generate a choropleth map that visualizes hypothetical COVID-19 vaccination rates by country. The rates are randomly generated dummy values, not real data. Let’s have a look at the code:

import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Create dummy vaccination data for all countries
np.random.seed(42) # For reproducibility
vaccination_data = {
'Country': world['name'],
'Vaccination Rate (%)': np.random.randint(0, 100, len(world)) # Generate random values
}
vaccination_df = pd.DataFrame(vaccination_data)
merged_data = world.merge(vaccination_df, left_on='name', right_on='Country')
fig, ax = plt.subplots(figsize=(12, 8))
variable = 'Vaccination Rate (%)'
cmap = 'YlGnBu'
merged_data.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Line 7: We set the random seed for reproducibility, ensuring that the random numbers generated are consistent across runs.

  • Lines 8–11: We create a dictionary with two keys, Country and Vaccination Rate (%). The Country key takes its values from the name column of the world GeoDataFrame. The Vaccination Rate (%) key is populated with one random integer per country, drawn with np.random.randint() from 0 to 99 (the upper bound, 100, is exclusive).

The output of this code will be the following map:

Hypothetical COVID-19 vaccination rate by country

Use the new geodatasets package for loading data#

The built-in datasets module was deprecated in GeoPandas 0.14 and removed in GeoPandas 1.0. Instead, you should now use the geodatasets package (installed with pip install geodatasets) to access sample geographic data. This ensures compatibility with the latest ecosystem and avoids errors in newer environments:

import geopandas as gpd
from geodatasets import get_path
world = gpd.read_file(get_path("naturalearth.land"))

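This simple change keeps your code aligned with current GeoPandas releases. Note that naturalearth.land contains only land polygons, without country-level attributes such as name or pop_est, so the earlier examples need a country-level dataset (for example, the Natural Earth countries file) to run in newer environments.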

Choose the right map projection#

Choropleth maps are designed to compare data across geographic regions, but many common projections (like Mercator or Plate Carrée) distort the size of countries, especially near the poles.
It’s generally good practice to reproject your data into an equal-area projection before mapping.

Here’s how to do it:

world = world.to_crs("EPSG:8857") # Equal Earth (Greenwich) projection

Using an equal-area projection ensures your visualizations reflect true area proportions, making data comparisons more accurate and meaningful.
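
To get a feel for how large the distortion is, here is a small sketch that needs no mapping library. It assumes a spherical Earth with a mean radius of 6,371 km (an approximation), and the helper name cell_area_km2 is made up for this illustration. It compares two cells that span the same number of degrees, and therefore look identical on a plain latitude/longitude plot, but cover very different areas on the ground:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def cell_area_km2(lat_south, lat_north, lon_width_deg):
    """Area of a latitude/longitude cell on a sphere, in square kilometres."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * math.radians(lon_width_deg) * band

equator_cell = cell_area_km2(0, 10, 10)    # 10° x 10° cell touching the equator
northern_cell = cell_area_km2(60, 70, 10)  # same extent in degrees, far north

ratio = northern_cell / equator_cell
print(f"Equatorial cell: {equator_cell:,.0f} km^2")
print(f"Northern cell:   {northern_cell:,.0f} km^2")
print(f"Northern / equatorial area: {ratio:.2f}")
```

The northern cell covers well under half the area of the equatorial one, even though both are drawn the same size when longitude and latitude are plotted directly. An equal-area projection removes exactly this kind of visual bias.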

Normalize your data before mapping#

One of the most common mistakes in choropleth mapping is visualizing raw counts (like total population). Because regions vary widely in size and population, these maps can mislead viewers.
Instead, always normalize your data — for example, by calculating per-capita rates or percentages.

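# Assumes a country-level GeoDataFrame with pop_est; area comes from an equal-area CRS (Equal Earth)
world["area_km2"] = world.to_crs("EPSG:8857").area / 1e6  # m² to km²
world["pop_density"] = world["pop_est"] / world["area_km2"]
world.plot(column="pop_density", legend=True, cmap="viridis")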

This small step transforms your visualization from a simple map into a meaningful comparison tool.
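
The effect of normalization is easy to see on a tiny table. The following sketch uses pandas only, and every figure in it is invented purely for illustration (the region names A, B, and C are placeholders). It ranks the same regions by raw count and by rate per 100,000 people:

```python
import pandas as pd

# Hypothetical figures, invented purely for illustration
regions = pd.DataFrame({
    "region": ["A", "B", "C"],
    "cases": [5000, 1200, 300],
    "population": [10_000_000, 800_000, 150_000],
})

# Normalize the raw count by population to get a comparable rate
regions["cases_per_100k"] = regions["cases"] / regions["population"] * 100_000

by_count = regions.sort_values("cases", ascending=False)["region"].tolist()
by_rate = regions.sort_values("cases_per_100k", ascending=False)["region"].tolist()

print("Ranked by raw count:", by_count)
print("Ranked by rate:     ", by_rate)
```

In this made-up data, the region with the most cases has the lowest rate, so a choropleth of raw counts and a choropleth of rates would tell opposite stories.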

Control classification with mapclassify#

By default, GeoPandas draws a numeric column with a continuous color scale. But in many cases, classifying data into ranges provides clearer patterns and better storytelling.
The mapclassify library (an optional GeoPandas dependency, installed with pip install mapclassify) lets you control how your data is grouped through the scheme argument:

world.plot(
    column="pop_density",
    scheme="Quantiles",
    k=5,
    legend=True,
    cmap="Blues"
)

Popular classification methods:

  • Quantiles: Places an approximately equal number of regions in each class.

  • Natural Breaks (Jenks): Highlights natural groupings in the data.

  • Equal Interval: Divides the range into evenly spaced intervals.

Always explain why you chose a particular method — it improves transparency and interpretability.
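
To see why the choice matters, here is a minimal sketch that computes quantile and equal-interval classes by hand with NumPy rather than through mapclassify, so the exact break values may differ slightly from what a classification library reports. The ten values are made up, with one deliberate outlier:

```python
import numpy as np

# Made-up, skewed values: nine small numbers and one outlier
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
k = 5

# Quantile scheme: upper class bounds at the 20th, 40th, ..., 100th percentiles
quantile_bounds = np.quantile(values, np.linspace(0, 1, k + 1)[1:])

# Equal-interval scheme: split the min-max range into k equally wide bins
interval_bounds = np.linspace(values.min(), values.max(), k + 1)[1:]

# A value belongs to the first class whose upper bound is >= the value
quantile_classes = np.searchsorted(quantile_bounds, values, side="left")
interval_classes = np.searchsorted(interval_bounds, values, side="left")

quantile_counts = np.bincount(quantile_classes, minlength=k)
interval_counts = np.bincount(interval_classes, minlength=k)

print("Regions per class (quantiles):     ", quantile_counts)
print("Regions per class (equal interval):", interval_counts)
```

With skewed data like this, the quantile scheme spreads regions evenly across classes, while the equal-interval scheme puts almost everything into the lowest class and leaves the middle classes empty. Neither is wrong, but they produce very different maps from the same numbers.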

When not to use a choropleth map#

Choropleths aren’t always the right choice. They work best for standardized, continuous variables (like percentages or rates).
If you’re mapping absolute counts, consider alternative visualizations:

  • Proportional symbol maps: Represent counts with varying marker sizes.

  • Dot density maps: Show distribution patterns more directly.

  • Cartograms: Distort shapes to reflect data values.

Choosing the right visualization type ensures your insights are accurate and easy to understand.
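
For proportional symbol maps, one common convention is to make the symbol's area, not its radius, proportional to the count, which means the radius grows with the square root of the value. The sketch below illustrates that scaling with NumPy. The counts and the maximum radius of 20 points are arbitrary values chosen for the example:

```python
import numpy as np

counts = np.array([100, 400, 1600])  # hypothetical absolute counts
max_radius_pt = 20.0                 # radius for the largest count (arbitrary choice)

# Radius scales with the square root so that circle area scales with the count
radii = max_radius_pt * np.sqrt(counts / counts.max())
areas = np.pi * radii ** 2

for count, radius, area in zip(counts, radii, areas):
    print(f"count={count:5d}  radius={radius:5.1f} pt  area={area:8.1f} pt^2")
```

Scaling the radius linearly instead would make a value that is four times larger look sixteen times larger, since readers tend to compare areas.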

Understand spatial pitfalls: MAUP and ecological fallacy#

Spatial data can be tricky. Two common problems to be aware of:

  • MAUP (Modifiable Areal Unit Problem):
    Your results can change based on how regions are defined (e.g., counties vs. states).
    A map created using large administrative boundaries might hide local variations, while a finer-grained map could exaggerate minor fluctuations.
    To mitigate MAUP, try running your analysis at multiple scales or using consistent boundary definitions across time-series data.

  • Ecological fallacy:
    Patterns observed at the regional level don’t always apply to individuals within those regions.
    For example, a county with a high literacy rate does not mean every individual there is literate.
    When presenting results, be clear about the level of aggregation and avoid drawing conclusions about individual behavior from aggregated data.

Including these caveats not only strengthens your analysis but also teaches readers how to interpret spatial patterns critically.
Adding brief context in your captions or methodology section can go a long way in preventing misinterpretation.
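
MAUP can be demonstrated with a few lines of pandas. In this sketch, the states, counties, and all numbers are invented for illustration. The same underlying data is summarized at two levels of aggregation:

```python
import pandas as pd

# Invented data: two states, each made up of two equally sized counties
counties = pd.DataFrame({
    "state": ["X", "X", "Y", "Y"],
    "county": ["X1", "X2", "Y1", "Y2"],
    "cases": [90, 10, 50, 50],
    "population": [1000, 1000, 1000, 1000],
})
counties["rate"] = counties["cases"] / counties["population"]

# Aggregate the counts to state level first, then recompute the rate
states = counties.groupby("state")[["cases", "population"]].sum()
states["rate"] = states["cases"] / states["population"]

print(counties[["county", "rate"]])
print(states[["rate"]])
```

At the state level both regions show the same rate, so a state choropleth would be uniformly colored. At the county level, one county stands out sharply. The data has not changed, only the boundaries used to summarize it.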

Use color palettes thoughtfully#

Color is more than aesthetics — it communicates meaning. Choose color schemes that align with your data type:

  • Sequential palettes (e.g., Blues, viridis) for ordered data like percentages.

  • Diverging palettes (e.g., RdBu, PuOr) for data with a meaningful midpoint.

  • Qualitative palettes for categorical data.

Also, consider colorblind-friendly palettes, such as those marked as colorblind safe in ColorBrewer, to make your maps more accessible.
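
With a diverging palette, the neutral color should sit on the meaningful midpoint even when the data range is not symmetric around it. One way to arrange this in Matplotlib is TwoSlopeNorm. The sketch below assumes Matplotlib 3.5 or later (for the matplotlib.colormaps registry), and the range from -10 to 30 with a midpoint of 0 is an arbitrary example:

```python
import matplotlib
from matplotlib.colors import TwoSlopeNorm

# Example range: values from -10 to 30, with 0 as the meaningful midpoint
norm = TwoSlopeNorm(vmin=-10, vcenter=0, vmax=30)
cmap = matplotlib.colormaps["RdBu"]  # diverging palette

for value in (-10, 0, 30):
    position = float(norm(value))  # position in the colormap, between 0 and 1
    rgba = cmap(position)          # the color that would be drawn for this value
    print(f"value={value:4d}  colormap position={position:.2f}  rgba={rgba}")
```

The midpoint maps to the center of the colormap, so the palette's neutral color marks zero rather than the arithmetic middle of the data range. The same norm object can then be supplied alongside the colormap when drawing the map.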

Explore advanced techniques: bivariate and interactive maps#

Once you’re comfortable with basic choropleths, you can experiment with more advanced approaches:

  • Bivariate choropleths: Combine two variables in a single map using a two-dimensional color scale.

  • Interactive visualizations: Use libraries like Folium, Plotly, or Altair to add tooltips, zooming, and filtering.

  • Vector tile rendering: For very large datasets, consider tools like deck.gl or MapLibre for performance and scalability.
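
As a rough sketch of the first technique, a bivariate choropleth can start from a combined class code: each variable is split into three classes, and the pair of classes identifies one cell of a 3 × 3 color grid. The column names income and health and all values below are hypothetical, and the color grid itself is left out:

```python
import pandas as pd

# Hypothetical values for nine regions
df = pd.DataFrame({
    "region": list("ABCDEFGHI"),
    "income": [10, 20, 30, 40, 50, 60, 70, 80, 90],
    "health": [90, 10, 50, 20, 80, 40, 30, 70, 60],
})

# Split each variable into terciles: 0 = low, 1 = medium, 2 = high
df["income_class"] = pd.qcut(df["income"], 3, labels=False)
df["health_class"] = pd.qcut(df["health"], 3, labels=False)

# Combine both classes into a single code from 0 to 8 (one per grid cell)
df["bivariate_class"] = df["income_class"] * 3 + df["health_class"]

print(df[["region", "income_class", "health_class", "bivariate_class"]])
```

Each of the nine codes would then be assigned one color from a two-dimensional palette, and the combined column can be plotted like any other categorical variable.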

These techniques help you go beyond static maps and build richer, more engaging geographic stories.

Work with real data in your case studies#

If you’re demonstrating a real-world use case — like vaccination rates, income levels, or environmental indicators — try to use authentic datasets from trusted sources (e.g., World Bank, WHO, or national statistics agencies). If synthetic data is necessary for demonstration, clearly label it as such to avoid confusion.

Using real-world datasets also allows you to demonstrate practical challenges, such as dealing with missing values, handling inconsistent region names, and normalizing data from multiple sources. For example, you could build a choropleth map showing carbon emissions per capita across countries using data from the Global Carbon Atlas, or visualize regional unemployment rates using data from a national statistics bureau.
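
Inconsistent region names are usually the first obstacle. A simple way to catch them is to merge with indicator=True and inspect the rows that found no partner. The sketch below uses plain pandas DataFrames as stand-ins for a map layer and a statistics table, and the values are placeholders:

```python
import pandas as pd

# Stand-in for the name column of a map layer
map_names = pd.DataFrame({"name": ["United States of America", "Canada", "France"]})

# Stand-in for an external statistics table (placeholder values)
stats = pd.DataFrame({
    "Country": ["United States", "Canada", "France"],
    "value": [1.0, 2.0, 3.0],
})

# A left merge keeps every map region; _merge shows which ones found data
merged = map_names.merge(stats, left_on="name", right_on="Country",
                         how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print("Regions without data:", unmatched)

# Harmonize the names, then merge again
stats["Country"] = stats["Country"].replace(
    {"United States": "United States of America"}
)
fixed = map_names.merge(stats, left_on="name", right_on="Country",
                        how="left", indicator=True)
print("All regions matched:", bool((fixed["_merge"] == "both").all()))
```

Running this check before plotting makes silent gaps visible. Otherwise, unmatched regions simply end up with missing values and can disappear from the map.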

Additionally, consider combining your choropleth with complementary visualizations — such as bar charts or line graphs — to add more context and help users interpret geographic trends alongside historical or categorical information.

Conclusion#

This blog has introduced geospatial analysis, the GeoPandas library, and the exciting world of choropleth maps. We then dove deeper into the details of choropleth maps and explored the diverse scenarios in which they apply.

If you want to learn more about choropleth maps, look no further! Check out the exciting new courses available on the Educative platform:

  1. Interactive Dashboards and Data Apps with Plotly and Dash

  2. Introduction to Data Science with Python

  3. Introduction to Geospatial Analysis with Python and GeoPandas


Written By:
Nimra Zaheer