How to use the loc and iloc functions on a DataFrame in pandas
What is a DataFrame?
A DataFrame is a commonly used two-dimensional data structure in pandas. It is a table made up of labeled rows and columns.
Requirements
DataFrames require the pandas library, as shown below.
import pandas as pd
Code
A DataFrame can be created from a dictionary of lists, as shown below.
Example
In this example, we create a DataFrame of countries, each assigned to a group and given an a_score and a b_score.
Both scores are imaginary values for this example.
import pandas as pd

# Imaginary scores, one entry per country
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India',
           'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

# Each dictionary key becomes a column name
df = pd.DataFrame({'group': groups,
                   'country': country,
                   'a_score': a_score,
                   'b_score': b_score})
print(df)
The loc and iloc functions
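loc and iloc select rows and columns of a DataFrame. Although they are commonly called functions, they are indexers and are used with square brackets rather than parentheses.
- loc[]: selection by labels
- iloc[]: selection by positions
Upper boundaries of a slice are included when you use loc, and are excluded when you use iloc.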
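The boundary rule is easiest to see by slicing the same frame both ways. The following is a minimal sketch on a small hypothetical frame (the data and the names by_label and by_position are made up for illustration) that uses the default integer index, so labels and positions happen to coincide.

```python
import pandas as pd

# Small hypothetical frame with the default integer index 0..4.
df = pd.DataFrame({'country': ['Pakistan', 'USA', 'Canada', 'Brazil', 'India'],
                   'a_score': [4, 5, 7, 8, 2]})

by_label = df.loc[:2]      # label slice: rows labelled 0, 1 and 2 (end included)
by_position = df.iloc[:2]  # position slice: rows at positions 0 and 1 (end excluded)

print(len(by_label))     # 3
print(len(by_position))  # 2
```

The same slice bounds therefore return one more row with loc than with iloc.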
Syntax
Both indexers take a row selection followed by an optional column selection inside square brackets, for example:
df.loc[3:, ['country', 'a_score']]
df.iloc[2:, 3:]
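With the default index 0, 1, 2, ... row labels and row positions look identical, which can hide the difference between the two indexers. The sketch below assumes a hypothetical frame named scores that is indexed by country name, so the label-based and position-based forms have to be written differently to reach the same rows.

```python
import pandas as pd

# Hypothetical frame indexed by country name instead of 0..n-1.
scores = pd.DataFrame(
    {'a_score': [4, 5, 7, 8], 'b_score': [1, 2, 3, 4]},
    index=['Pakistan', 'USA', 'Canada', 'Brazil'],
)

# loc uses the index labels and the column names.
from_label = scores.loc['Canada':, ['a_score']]

# iloc uses integer positions only: rows from position 2 on, column 0.
from_position = scores.iloc[2:, [0]]

print(from_label.equals(from_position))  # True
```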
Parameters
- loc: the row and column labels you want to select
- iloc: the integer row and column positions you want to select
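Besides slices, both indexers accept a single value or a list, and loc also accepts a boolean mask for the rows. The sketch below shows these forms on a small hypothetical frame; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'country': ['Pakistan', 'USA', 'Canada', 'Brazil'],
                   'a_score': [4, 5, 7, 8],
                   'b_score': [1, 2, 3, 4]})

single = df.loc[1, 'country']                    # one row label, one column label
listed = df.loc[[0, 3], ['country', 'b_score']]  # lists of labels
masked = df.loc[df['a_score'] > 4, 'country']    # boolean mask selecting rows

first_cell = df.iloc[0, 0]        # one row position, one column position
picked = df.iloc[[0, 3], [0, 2]]  # lists of positions

print(single)          # USA
print(list(masked))    # ['USA', 'Canada', 'Brazil']
print(first_cell)      # Pakistan
```

Here listed and picked contain the same rows and columns, reached once by label and once by position.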
Return value
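Both return the selected data. Depending on the selection, the result is a DataFrame, a Series (for example, a single column), or a single value (one cell).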
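The type of the result depends on how much is selected. A minimal sketch on a hypothetical three-row frame, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({'country': ['Pakistan', 'USA', 'Canada'],
                   'a_score': [4, 5, 7]})

table = df.loc[:1, ['country', 'a_score']]  # slice + list of columns -> DataFrame
column = df.loc[:, 'a_score']               # single column label -> Series
row = df.iloc[0]                            # single row position -> Series
value = df.iloc[2, 1]                       # single row and column -> scalar

print(type(table).__name__)   # DataFrame
print(type(column).__name__)  # Series
print(type(row).__name__)     # Series
print(value)                  # 7
```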
Example
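The example below selects the first 3 rows with both indexers. loc selects the country and group columns by label, and iloc selects the last 2 columns (a_score and b_score) by position.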
import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India',
           'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

df = pd.DataFrame({'group': groups,
                   'country': country,
                   'a_score': a_score,
                   'b_score': b_score})

print("loc")
# Row labels 0 to 2 (end included), columns chosen by name
print(df.loc[:2, ['country', 'group']])

print("iloc")
# Row positions 0 to 2 (3 is excluded), columns from position 2 onward
print(df.iloc[:3, 2:])