Comparing Ratings by Gender and Plotting Them

Explore how to use Python pandas to analyze and compare movie ratings by gender. Learn to calculate rating differences, sort data with absolute values, and visualize insights with horizontal bar charts. This lesson equips you with skills to manage datasets and produce demographic-based visualizations using real-world data.

Let’s see which movies are rated highest by users tagged as female in the MovieLens dataset.

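Look at the highlighted line of the code widget below, where top_movies_individuals_tagged_as_female is assigned. It sorts ratings_by_gender by the 'F' column, which stands for female, in descending order so that the highest average ratings come first: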

Python 3.8
import pandas as pd
import matplotlib.pyplot as plt

def movie(nogui=False, movielenspath=''):
    user_columns = ['user_id', 'age', 'gender']
    users = pd.read_csv(movielenspath + 'movie_lens/u.user', sep='|', names=user_columns, usecols=range(3))

    rating_columns = ['user_id', 'movie_id', 'rating']
    ratings = pd.read_csv(movielenspath + 'movie_lens/u.data', sep='\t', names=rating_columns, usecols=range(3))

    movie_columns = ['movie_id', 'title']
    movies = pd.read_csv(movielenspath + 'movie_lens/u.item', sep='|', names=movie_columns, usecols=range(2), encoding="iso-8859-1")

    # create one merged DataFrame
    movie_ratings = pd.merge(movies, ratings)
    movie_data = pd.merge(movie_ratings, users)

    ratings_by_title = movie_data.groupby('title').size()
    popular_movies = ratings_by_title.index[ratings_by_title >= 250]

    ratings_by_gender = movie_data.pivot_table('rating', index='title', columns='gender')
    ratings_by_gender = ratings_by_gender.loc[popular_movies]

    top_movies_individuals_tagged_as_female = ratings_by_gender.sort_values(by='F', ascending=False)
    print("Top rated movies by individuals_tagged_as_female \n", top_movies_individuals_tagged_as_female.head())
    print("\n")

if __name__ == "__main__":
    movie()

As we can see, most ...
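
To try the pivot-and-sort step from the code widget without the MovieLens files, here is a minimal sketch on a small in-memory DataFrame. The titles, ratings, and the popularity threshold of 3 are made up for illustration and are not part of the dataset. Only the pattern (pivot_table, filter by rating count, sort_values on the 'F' column) mirrors the lesson's code.

```python
import pandas as pd

# Hypothetical toy data standing in for the merged MovieLens DataFrame
movie_data = pd.DataFrame({
    "title":  ["A", "A", "A", "B", "B", "B", "C", "C"],
    "gender": ["F", "M", "F", "F", "M", "M", "F", "M"],
    "rating": [5, 3, 4, 2, 4, 5, 3, 3],
})

# Mean rating per title (rows) and gender (columns); mean is the default aggregation
ratings_by_gender = movie_data.pivot_table(values="rating", index="title", columns="gender")

# Keep only titles with at least 3 ratings (the lesson uses 250 on the real data)
ratings_by_title = movie_data.groupby("title").size()
popular_movies = ratings_by_title.index[ratings_by_title >= 3]
ratings_by_gender = ratings_by_gender.loc[popular_movies]

# Sort by the average rating in the 'F' column, highest first
top_female = ratings_by_gender.sort_values(by="F", ascending=False)
print(top_female)
```

Title "C" has only two ratings, so the popularity filter drops it before sorting. The remaining rows are then ordered by their mean rating in the 'F' column.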