How to perform cross-tabulation on a column in pandas

In pandas, cross-tabulation builds a frequency table of two or more factors: it counts how often each combination of their values occurs. This gives us the joint frequency distribution of the variables.
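To make the idea of a frequency table concrete, here is a minimal sketch in plain Python that counts combinations by hand. The sample pairs and the variable names are assumptions chosen for illustration; this is not how pandas implements the computation internally.

```python
from collections import Counter

# Illustrative sample: one (nationality, gender) pair per person
pairs = [
    ("Germany", "Male"),
    ("USA", "Male"),
    ("USA", "Female"),
    ("China", "Female"),
]

# Count how often each combination of the two factors occurs
counts = Counter(pairs)

# Distinct values of each factor, sorted for a stable layout
rows = sorted({n for n, _ in pairs})
cols = sorted({g for _, g in pairs})

# Print a small table: one row per nationality, one column per gender
print("Nationality", *cols, sep="\t")
for n in rows:
    print(n, *(counts[(n, g)] for g in cols), sep="\t")
```

Combinations that never occur, such as ("China", "Male"), simply count as 0.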

Syntax

pandas.crosstab(index, columns, values = None, rownames = None, colnames = None, aggfunc = None, margins = False, margins_name = 'All', dropna = True, normalize = False)

Syntax of crosstab function

To perform cross-tabulation, we call the function pandas.crosstab(). The index parameter takes the values to group by in the rows, and the columns parameter takes the values to group by in the columns. The remaining parameters, such as values, rownames, and colnames, are optional. When they are omitted, they keep the default values shown in the syntax above.
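As a rough sketch of how two of the optional parameters behave, the snippet below uses margins (with margins_name) to add totals and normalize to turn counts into proportions. The small arrays are made-up sample data, and the variable names with_totals and row_shares are chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data for illustration
nationality = np.array(["Germany", "USA", "USA", "China"], dtype=object)
gender = np.array(["Male", "Male", "Female", "Female"], dtype=object)

# margins=True appends a row and a column of totals, labelled by margins_name
with_totals = pd.crosstab(
    nationality, gender,
    rownames=["Nationality"], colnames=["Gender"],
    margins=True, margins_name="Total",
)
print(with_totals)

# normalize="index" divides each row by its row total, giving row proportions
row_shares = pd.crosstab(
    nationality, gender,
    rownames=["Nationality"], colnames=["Gender"],
    normalize="index",
)
print(row_shares)
```

Other accepted values for normalize include "columns" and "all", which divide by the column totals or the grand total instead.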

Example

For instance, consider a table with the following data:

Employee Name	Nationality	Gender
Jerry	Germany	Male
Harry	USA	Male
Emma	USA	Female
Amalia	China	Female

Suppose we want to find out how many male and female employees come from each country. pandas lets us do this with the crosstab() function:

import numpy as np
import pandas as pd
employee_name = np.array(["Jerry", "Harry", "Emma", "Amalia"], dtype = object)
nationality = np.array(["Germany", "USA", "USA", "China"], dtype = object)
gender = np.array(["Male", "Male", "Female", "Female"], dtype = object)
print(pd.crosstab(nationality, gender, rownames = ['Nationality'], colnames = ['Gender']))

We see the results below:

Nationality	Female	Male
China	1	0
Germany	0	1
USA	1	1

Explanation

Lines 3–5: Store the data in the employee_name, nationality, and gender variables, respectively.
Line 6: Call the crosstab() function with nationality as the rows and gender as the columns. The rownames and colnames arguments label the two axes.
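When the data already lives in a DataFrame, the columns can be passed to crosstab() directly. The sketch below assumes a hypothetical DataFrame named df holding the same employees; the Salary column and its numbers are invented purely to illustrate the values and aggfunc parameters.

```python
import pandas as pd

# Hypothetical DataFrame; the Salary figures are invented for illustration
df = pd.DataFrame({
    "Employee Name": ["Jerry", "Harry", "Emma", "Amalia"],
    "Nationality": ["Germany", "USA", "USA", "China"],
    "Gender": ["Male", "Male", "Female", "Female"],
    "Salary": [50000, 60000, 65000, 55000],
})

# Passing Series: the axis labels are taken from the Series names,
# so rownames and colnames are not needed here
counts = pd.crosstab(df["Nationality"], df["Gender"])
print(counts)

# values + aggfunc: aggregate Salary within each cell instead of counting rows
mean_salary = pd.crosstab(
    df["Nationality"], df["Gender"],
    values=df["Salary"], aggfunc="mean",
)
print(mean_salary)
```

In the second table, cells with no matching employees (for example China and Male) hold NaN rather than 0, because there is nothing to average.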

Copyright ©2026 Educative, Inc. All rights reserved