How to use the isin function on a dataframe in pandas

What is a dataframe?

A dataframe is the two-dimensional, tabular data structure provided by the pandas library. It organizes data into labeled rows and columns, much like a spreadsheet or a database table.

To work with dataframes, import the pandas library as shown below.

import pandas as pd

A dataframe can be created as shown below. The following dataframe lists countries that have been placed in different groups and assigned an a_score and a b_score. Both scores are imaginary values used only for this example.

import pandas as pd
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score})
print(df)

isin function

The isin function checks whether each value is contained in a given collection of values. Used as a filter, it keeps only the rows whose value in a column matches one of the selected choices.

Prototype

The function is typically used as follows: the list of values to keep is passed to isin, and the result is used to index the dataframe.

selection = ['Pakistan','USA','Belgium']
df[df.country.isin(selection)]

Parameter

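The function takes one parameter: the collection of values to match against, such as the selection list above. When called on a column, this can be any list-like object or a set.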
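
When isin is called on a whole dataframe rather than a single column, the parameter can also be a dict that maps column names to the values allowed in each column. The following is a minimal sketch of that form; the small frame and the name allowed are made up for illustration and are not part of the example above.

```python
import pandas as pd

# Small illustrative frame (not the full example data)
df = pd.DataFrame({
    'group': ['A', 'B', 'C'],
    'country': ['Pakistan', 'Canada', 'Malaysia'],
})

# Each key names a column; its list holds the values allowed in that column
allowed = {'group': ['A', 'B'], 'country': ['Canada']}

# The result is a Boolean dataframe with the same shape as df
matches = df.isin(allowed)
print(matches)

# Keep only the rows where every column matched
both = df[matches.all(axis=1)]
print(both)
```

Here only the second row has both a matching group and a matching country, so it is the only row kept.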

Return value

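On its own, isin returns a Boolean Series that is True where the value is in the selection and False elsewhere. Indexing the dataframe with that Series, as in df[df.country.isin(selection)], returns the filtered rows.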
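
To make the two steps visible, the sketch below stores the result of isin in a variable before using it as an index. It uses a shortened, made-up version of the example data, and the names mask and filtered are chosen here only for illustration.

```python
import pandas as pd

# Shortened, illustrative version of the example data
df = pd.DataFrame({
    'country': ['Pakistan', 'USA', 'Canada', 'Belgium'],
    'a_score': [4, 5, 7, 3],
})
selection = ['Pakistan', 'USA', 'Belgium']

# Step 1: isin returns a Boolean Series aligned with df's index
mask = df['country'].isin(selection)
print(mask.tolist())  # [True, True, False, True]

# Step 2: indexing with the mask keeps only the rows marked True
filtered = df[mask]
print(filtered)
```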

Code

import pandas as pd
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score})
selection = ['Pakistan','USA','Belgium']
print(df[df.country.isin(selection)])
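
Because the result of isin is an ordinary Boolean Series, it can be inverted with ~ or combined with other conditions using &. The sketch below shows both on a shortened, made-up version of the data; the names excluded and in_b are illustrative only.

```python
import pandas as pd

# Shortened, illustrative version of the example data
df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B'],
    'country': ['Pakistan', 'USA', 'Canada', 'Belgium'],
})
selection = ['Pakistan', 'USA', 'Belgium']

# ~ flips the mask, keeping rows whose country is NOT in the selection
excluded = df[~df['country'].isin(selection)]
print(excluded)

# & combines masks: country in the selection AND group equal to 'B'
in_b = df[df['country'].isin(selection) & (df['group'] == 'B')]
print(in_b)
```

The parentheses around the second condition are needed because & binds more tightly than == in Python.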
