Solution: Merging Datasets

Understand how to merge datasets for use with Plotly Express by reshaping DataFrames between long and wide formats, applying the pandas merge function, and cleaning the result by removing null values. This lesson helps you prepare consistent data for interactive visualizations.

The solution below merges the two datasets, drops the null values, and prints the first rows of the final dataset.
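
As a rough illustration of the merge step on its own, the sketch below left-joins two small, made-up DataFrames and shows how keys missing from the right-hand table surface as NaN, which can then be filled. The names `values` and `meta` and their contents are hypothetical and are not taken from the lesson's data files.

```python
import pandas as pd

# Hypothetical toy data, not the lesson's CSV files
values = pd.DataFrame({
    'Country Code': ['AAA', 'BBB', 'HIC'],
    'population': [10.0, 20.0, 300.0],
})
meta = pd.DataFrame({
    'Country Code': ['AAA', 'BBB'],
    'Region': ['North', 'South'],
})
meta['is_country'] = meta['Region'].notna()

# A left join keeps every row of `values`; 'HIC' has no match in `meta`
merged = pd.merge(values, meta, on='Country Code', how='left')

# Unmatched rows hold NaN in the columns that come from `meta`,
# so the flag is filled with False and cast back to bool
merged['is_country'] = merged['is_country'].fillna(False).astype(bool)
print(merged)
```

With `how='left'`, the row count of the left table is preserved, which is why the unmatched aggregate row needs an explicit fill rather than being dropped.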

Solution

Python
# Importing libraries
import pandas as pd
# Loading data
data = pd.read_csv('../data/PovStatsData.csv')
country = pd.read_csv('../data/PovStatsCountry.csv', na_values='', keep_default_na=False)
# Dropping the unneeded 'Unnamed: 50' column
data = data.drop('Unnamed: 50', axis=1)
# Identifier columns to keep while melting
# (assumed to be the non-year columns of PovStatsData.csv)
id_vars = ['Country Name', 'Country Code', 'Indicator Name', 'Indicator Code']
# Melting the DataFrame: year columns become rows, and rows with a null value are dropped
data_melt = pd.melt(data, id_vars=id_vars, var_name='year').dropna(subset=['value'])
data_melt['year'] = data_melt['year'].astype(int)
# Creating the is_country column: only actual countries have a Region
country['is_country'] = country['Region'].notna()
# Pivoting the melted DataFrame so that each indicator becomes a column
data_pivot = data_melt.pivot(index=['Country Name', 'Country Code', 'year'],
                             columns='Indicator Name',
                             values='value').reset_index()
# Merging the pivoted data with the country metadata on the shared key
poverty = pd.merge(data_pivot, country, on='Country Code', how='left')
# "High Income" has no match in the country data, so its is_country value is NA;
# we fill it with False, as it is not a country
poverty['is_country'] = poverty['is_country'].fillna(False)
print(poverty.head())
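
The reshaping steps in the solution can be tried without the CSV files. The following is a minimal sketch on a made-up wide table (the frame `wide`, its codes, and its numbers are all hypothetical) that melts year columns into rows, drops nulls, and pivots the indicators back into columns.

```python
import pandas as pd

# Hypothetical wide table: one column per year, one row per country/indicator
wide = pd.DataFrame({
    'Country Code': ['AAA', 'AAA', 'BBB', 'BBB'],
    'Indicator Name': ['gini', 'population', 'gini', 'population'],
    '2000': [30.0, 10.0, None, 20.0],
    '2001': [31.0, 11.0, 40.0, 21.0],
})

# Wide -> long: one row per (country, indicator, year); null values are dropped
long_df = pd.melt(wide, id_vars=['Country Code', 'Indicator Name'],
                  var_name='year').dropna(subset=['value'])
long_df['year'] = long_df['year'].astype(int)

# Long -> wide again, this time with one column per indicator
tidy = long_df.pivot(index=['Country Code', 'year'],
                     columns='Indicator Name',
                     values='value').reset_index()
print(tidy)
```

Note that dropping nulls in the long format does not guarantee a null-free result: a country-year pair that has some indicators but not others still yields NaN cells after pivoting.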

Explanation

...