Trusted answers to developer questions

Related Tags

pandas
python

What is loc in pandas?

Hassaan Waqar

The pandas library in Python is used to work with dataframes that structure data in rows and columns. It is widely used in data analysis and machine learning.

The loc indexer selects a portion of a dataframe by label. It supports indexing by row labels and column names, and by boolean expressions.

Indexing using rows and columns

The loc indexer can take two arguments: rows and columns.

Rows are specified by their index labels, and columns by their names. In a dataframe with the default index, the row labels are the integers 0, 1, 2, and so on, so they look like row numbers. The syntax is as follows:

dataframe.loc[rows, columns]
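
To make the label-based behavior concrete, here is a minimal sketch that assumes a small hypothetical dataframe named scores whose index holds player names instead of the default integers. The names scores, single_value, and subset are illustrative and do not come from the original article.

```python
import pandas as pd

# Hypothetical dataframe: the index holds strings, not default integers
scores = pd.DataFrame(
    {"Sports": ["Football", "Cricket", "Baseball"], "Rank": [1, 9, 7]},
    index=["Messi", "Afridi", "Chad"],
)

# Rows are selected by index label, columns by column name
single_value = scores.loc["Afridi", "Rank"]
subset = scores.loc[["Messi", "Chad"], ["Rank"]]

print(single_value)
print(subset)
```

With a string index like this, passing an integer such as 0 as the row argument would raise a KeyError, because loc looks up labels rather than positions.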

Slices in loc include both endpoints.

We can select a range of row labels with a slice, such as 0:5, which returns the rows labeled 0 through 5. The syntax is as follows:

df.loc[0:5, "column1"]
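
The inclusive endpoint is easy to miss because ordinary Python slices and the position-based iloc indexer both exclude it. The sketch below compares the two on a hypothetical seven-row dataframe; the values and the variable names by_label and by_position are made up for illustration.

```python
import pandas as pd

# Hypothetical dataframe with the default integer index 0..6
df = pd.DataFrame({"column1": [10, 20, 30, 40, 50, 60, 70]})

# loc slices by label and includes the end label 5
by_label = df.loc[0:5, "column1"]

# iloc slices by position and excludes the end position 5
by_position = df.iloc[0:5, 0]

print(len(by_label))     # 6
print(len(by_position))  # 5
```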

We can also select specific rows by passing their labels as a list. The syntax is as follows:

df.loc[[2,4,5], "column1"]

Similarly, we can select a single column by its name. If the name is passed as a plain string rather than inside a list, a series is returned. The syntax is as follows:

df.loc[[2,4,5], "column1"]

If we enclose the column name in a list, a dataframe is returned. The syntax is as follows:

df.loc[[2,4,5], ["column1"]]
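
The difference between the two return types can be checked directly. This is a small sketch on a hypothetical six-row dataframe; the column values and the names as_series and as_frame are assumptions made for the example.

```python
import pandas as pd

# Hypothetical dataframe with two columns and the default integer index
df = pd.DataFrame({
    "column1": [10, 20, 30, 40, 50, 60],
    "column2": ["a", "b", "c", "d", "e", "f"],
})

as_series = df.loc[[2, 4, 5], "column1"]   # column name as a string
as_frame = df.loc[[2, 4, 5], ["column1"]]  # column name inside a list

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (3, 1)
```

Keeping the result as a dataframe is useful when later code expects two-dimensional input, even for a single column.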

Example

The code snippet below shows how we can use the loc indexer for rows and columns:

import pandas as pd

# Creating a dataframe
df = pd.DataFrame({
    'Sports': ['Football', 'Cricket', 'Baseball', 'Basketball',
               'Tennis', 'Table-tennis', 'Archery', 'Swimming', 'Boxing'],
    'Player': ['Messi', 'Afridi', 'Chad', 'Johnny', 'Federer',
               'Yong', 'Mark', 'Phelps', 'Khan'],
    'Rank': [1, 9, 7, 12, 1, 2, 11, 1, 1],
})

print(df.loc[0:5, ['Player', 'Rank']])  # Using a row range and multiple columns
print('\n')
print(df.loc[[1, 2, 3], 'Player'])      # Using specific rows and returning a series
print('\n')
print(df.loc[[1, 2, 3], ['Player']])    # Using specific rows and returning a dataframe

Indexing using a boolean expression

We can also index the dataframe by placing boolean expressions within loc. The syntax is as follows:

dataframe.loc[expression]

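A boolean expression compares a column against a value with operators such as ==, >, and <, producing one True or False value per row. loc returns the rows for which the value is True.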
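
Conditions can also be combined, and a boolean expression can be paired with a column selection. The following is a minimal sketch using the first five rows of the article's sample data; the variable names mask and result are illustrative. Each condition needs its own parentheses because & and | bind more tightly than the comparison operators.

```python
import pandas as pd

# First five rows of the sample data used in this article
df = pd.DataFrame({
    "Sports": ["Football", "Cricket", "Baseball", "Basketball", "Tennis"],
    "Player": ["Messi", "Afridi", "Chad", "Johnny", "Federer"],
    "Rank": [1, 9, 7, 12, 1],
})

# Combine two conditions with & (and); use | for or
mask = (df["Rank"] == 1) & (df["Sports"] != "Football")

# The boolean mask picks the rows, the list picks the columns
result = df.loc[mask, ["Player", "Rank"]]

print(result)
```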

Example

The code snippet below shows loc using boolean expressions:

import pandas as pd

# Creating a dataframe
df = pd.DataFrame({
    'Sports': ['Football', 'Cricket', 'Baseball', 'Basketball',
               'Tennis', 'Table-tennis', 'Archery', 'Swimming', 'Boxing'],
    'Player': ['Messi', 'Afridi', 'Chad', 'Johnny', 'Federer',
               'Yong', 'Mark', 'Phelps', 'Khan'],
    'Rank': [1, 9, 7, 12, 1, 2, 11, 1, 1],
})

print(df.loc[df['Rank'] == 1])             # Rows where Rank equals 1
print('\n')
print(df.loc[df['Sports'] == 'Football'])  # Rows where Sports equals Football

Copyright ©2022 Educative, Inc. All rights reserved