Challenge 2: Merge with Missing Values (Medium)

This challenge asks you to merge DataFrames whose combined result contains missing values.

Problem definition

Your music analyst now has two datasets with complementary information.

The first dataset contains information about bands and their associated data. However, it refers to countries by their IDs instead of their names. Here are five sample rows:

artist        country  plays  genre  fans
The Beatles   1        150    rock   50
Iron Maiden   1        20000  metal  3500
Judas Priest  1        5000   metal  1000
Leprous       5        1000   metal  500
Rush          6        3000   rock   500

The second is a simple mapping between the ID of a country and its name.

country_id  name
1           UK
2           US
3           Egypt
4           Finland

This is very similar to the dataset in Challenge 1, with one small difference: some artists have a country ID that doesn’t exist in the countries table. This might be because the data was corrupted or inserted incorrectly.
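One way to spot the affected rows before merging is to check each artist's country ID against the IDs in the countries table. The sketch below rebuilds only the sample rows shown above; the variable names (artists, countries, unknown) are illustrative assumptions and not part of the challenge.

```python
import pandas as pd

# Sample rows only; the real dataset is assumed to be larger.
artists = pd.DataFrame({
    "artist": ["The Beatles", "Iron Maiden", "Judas Priest", "Leprous", "Rush"],
    "country": [1, 1, 1, 5, 6],
    "plays": [150, 20000, 5000, 1000, 3000],
    "genre": ["rock", "metal", "metal", "metal", "rock"],
    "fans": [50, 3500, 1000, 500, 500],
})
countries = pd.DataFrame({
    "country_id": [1, 2, 3, 4],
    "name": ["UK", "US", "Egypt", "Finland"],
})

# Keep the artists whose country ID has no entry in the countries table.
unknown = artists[~artists["country"].isin(countries["country_id"])]
unknown_artists = unknown["artist"].tolist()
print(unknown_artists)
```

In these sample rows, the IDs 5 and 6 have no entry in the countries table, so Leprous and Rush are the artists flagged.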

Your music analyst would like to know how many plays were affected by this error. Can you help find the total number of plays for all artists that have no country name associated with them?
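A possible approach, sketched here on the sample rows only, is to left-merge the artists onto the countries so that unmatched artists end up with a missing name, and then sum their plays. The variable names and the choice of a left merge are assumptions for illustration, not necessarily the intended solution.

```python
import pandas as pd

# Sample rows only; the real dataset is assumed to be larger.
artists = pd.DataFrame({
    "artist": ["The Beatles", "Iron Maiden", "Judas Priest", "Leprous", "Rush"],
    "country": [1, 1, 1, 5, 6],
    "plays": [150, 20000, 5000, 1000, 3000],
    "genre": ["rock", "metal", "metal", "metal", "rock"],
    "fans": [50, 3500, 1000, 500, 500],
})
countries = pd.DataFrame({
    "country_id": [1, 2, 3, 4],
    "name": ["UK", "US", "Egypt", "Finland"],
})

# A left merge keeps every artist; artists with an unknown country ID
# get NaN in the columns that come from the countries table.
merged = artists.merge(
    countries, how="left", left_on="country", right_on="country_id"
)

# Sum the plays of the rows whose country name is missing.
missing_plays = int(merged.loc[merged["name"].isna(), "plays"].sum())
print(missing_plays)
```

On the five sample rows, only Leprous (1000 plays) and Rush (3000 plays) lack a country name, so the sketch yields 4000. The figure for the full dataset will differ.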
