Trusted answers to developer questions

Related Tags

pandas
python
communitycreator

How to use the where function on a dataframe in pandas

Sheza Naveed

What is a dataframe?

A dataframe is a commonly used 2-dimensional data structure: a table with labeled rows and columns. In pandas, it is represented by the DataFrame object.

A sample dataframe

To work with dataframes, import the pandas library as shown below:

import pandas as pd

A dataframe can be created as shown in the example below. It contains countries that have been placed in different groups, each with an a_score and a b_score. Both scores are imaginary values used only for this example.

import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score})
print(df)
Example dataframe

The where function

The where function keeps the values for which a specified condition is true and replaces all other values. It can be applied to a single column or to a whole dataframe.

Prototype

A typical call takes a condition followed by a replacement value, for example:

df['new_column'].where(df['new_column'] > 4 , 9)

Values greater than 4 are kept as they are, and all remaining values are replaced by 9.
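To make the keep-or-replace behavior concrete, here is a minimal sketch on a small column. The dataframe and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the call shown above
df = pd.DataFrame({'new_column': [2, 4, 6, 8]})

# Values where the condition is True are kept; the others become 9
result = df['new_column'].where(df['new_column'] > 4, 9)
print(result.tolist())  # [9, 9, 6, 8]
```

Note that 4 itself is replaced, because the condition is strictly greater than 4.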

Parameters

  • The condition (cond): values are kept wherever it is true
  • The replacement value (other): used wherever the condition is false

If the replacement is not provided, the values that do not fulfill the condition are replaced by NaN.
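As a small sketch of the default behavior, the call below omits the replacement value. The series is a hypothetical example; the entries that fail the condition come back as NaN, and the integer values are upcast to floats so that NaN can be stored.

```python
import pandas as pd

# Hypothetical scores, used only for illustration
scores = pd.Series([1, 5, 10])

# No replacement value given: entries failing the condition become NaN
kept = scores.where(scores > 4)
print(kept.isna().tolist())  # [True, False, False]
```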

Return value

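where returns a new object of the same shape as the original (a column or a dataframe) with the values replaced. The original is not modified unless inplace=True is passed.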
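The sketch below applies where to a whole dataframe and checks that the original is left untouched. The two-column dataframe is a hypothetical, shortened version of the example data.

```python
import pandas as pd

# Hypothetical, shortened version of the example data
df = pd.DataFrame({'a_score': [4, 5, 7], 'b_score': [1, 2, 3]})

# Applied to the whole dataframe: every cell not greater than 3 becomes 0
result = df.where(df > 3, 0)

print(result['a_score'].tolist())  # [4, 5, 7]
print(result['b_score'].tolist())  # [0, 0, 0]
print(df['b_score'].tolist())      # [1, 2, 3], the original is unchanged
```

To keep the result, assign it back to a variable or to a column, as in df['b_score'] = df['b_score'].where(...).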

Code

The example below keeps the positive values in the replace column and replaces all other values with 0.

import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
replace = [-1, -2, -3, -4, -5, 1, 2, 3, 4, 5]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score, 'replace': replace})

# Keep the positive values in the 'replace' column; set all others to 0
print(df['replace'].where(df['replace'] > 0, 0))
An example of using where()
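The same pattern works on the other columns. As a sketch, replacing every b_score below 5 with 20 requires the condition to describe the values to keep, so it is written as b_score >= 5. The variable name updated is introduced here only for illustration.

```python
import pandas as pd

b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
df = pd.DataFrame({'b_score': b_score})

# where keeps values that satisfy the condition, so state what to keep:
# scores of 5 or more stay, everything below 5 becomes 20
updated = df['b_score'].where(df['b_score'] >= 5, 20)
print(updated.tolist())  # [20, 20, 20, 20, 5, 6, 7, 10, 8, 9]
```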
