A DataFrame is a commonly used two-dimensional data structure. It is a table made up of rows and columns, and it is the main object of the pandas library.
DataFrames require the pandas library, as shown below.
import pandas as pd
A DataFrame can be formed as shown below.
In this example, we create a DataFrame of countries that have been placed in different groups and assigned an a_score and a b_score. Both scores are imaginary values for this example.
import pandas as pd

# Imaginary scores for each country
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

# Build the DataFrame from a dictionary that maps column names to lists
df = pd.DataFrame({'group': groups, 'country': country, 'a_score': a_score, 'b_score': b_score})
print(df)
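To see the table structure described above, the following minimal sketch inspects a smaller frame laid out the same way. The three-row data is a cut-down, illustrative copy of the example and not a separate dataset. Note the default row labels 0, 1, 2, which are what label-based selection refers to later.

```python
import pandas as pd

# Smaller frame in the same layout as the example above (illustrative only).
df = pd.DataFrame({
    'group': ['A', 'A', 'B'],
    'country': ['Pakistan', 'USA', 'Canada'],
    'a_score': [4, 5, 7],
    'b_score': [1, 2, 3],
})

print(df.shape)          # (rows, columns) -> (3, 4)
print(list(df.columns))  # column labels, in insertion order
print(list(df.index))    # default row labels -> [0, 1, 2]
```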
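loc and iloc functions

The loc and iloc functions allow the selection of rows and columns. Strictly speaking, they are indexers rather than functions: they are used with square brackets, not called with parentheses.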
loc[]: selection by labels
iloc[]: selection by positions
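The difference between labels and positions is easiest to see when the row labels are not integers. The sketch below assumes a small, made-up frame (named small here purely for illustration) whose index holds country names.

```python
import pandas as pd

# Hypothetical data: the index holds string labels instead of 0, 1, 2.
small = pd.DataFrame(
    {'a_score': [4, 5, 7], 'b_score': [1, 2, 3]},
    index=['Pakistan', 'USA', 'Canada'],
)

# loc looks the row and column up by label.
by_label = small.loc['USA', 'b_score']

# iloc looks them up by integer position (second row, second column).
by_position = small.iloc[1, 1]

print(by_label, by_position)  # 2 2
```

Both expressions reach the same cell; only the way the cell is addressed differs.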
Upper boundaries are included when you use loc, and are excluded when you use iloc.
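A minimal sketch of the boundary rule, assuming a made-up one-column frame (scores) with the default integer index, so that the same slice 0:2 can be read both as labels and as positions:

```python
import pandas as pd

# Hypothetical one-column frame with the default integer index 0..4.
scores = pd.DataFrame({'a_score': [4, 5, 7, 8, 2]})

# loc slices by label and includes the end label 2: rows 0, 1 and 2.
with_loc = scores.loc[0:2]

# iloc slices by position and excludes position 2: rows 0 and 1.
with_iloc = scores.iloc[0:2]

print(len(with_loc), len(with_iloc))  # 3 2
```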
The basic usage of loc and iloc is as follows. Each takes a row selection, followed by an optional column selection.
df.loc[3:, ['country', 'a_score']]
df.iloc[2:, 3:]
loc: the labels you want to select
iloc: the positions you want to select
Both return the selected values: a DataFrame, a Series, or a single value, depending on how many rows and columns are selected.
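As a sketch of how the two expressions above behave, the block below rebuilds the DataFrame from this article and checks what each selection returns. The names by_label, by_position, column and cell are introduced here for illustration only.

```python
import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India',
           'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']
df = pd.DataFrame({'group': groups, 'country': country,
                   'a_score': a_score, 'b_score': b_score})

# Rows labelled 3 onwards, two columns chosen by name.
by_label = df.loc[3:, ['country', 'a_score']]

# Rows from position 2 onwards, columns from position 3 onwards (only b_score).
by_position = df.iloc[2:, 3:]

# A single column label gives a Series; a single row and column gives one value.
column = df.loc[:, 'country']
cell = df.iloc[0, 1]

print(by_label.shape, by_position.shape)  # (7, 2) (8, 1)
print(type(column).__name__, cell)        # Series Pakistan
```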
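The example below selects the first 3 rows with both loc and iloc. The loc call selects the country and group columns by label, while the iloc call selects the last 2 columns (a_score and b_score) by position.

import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']
df = pd.DataFrame({'group': groups, 'country': country, 'a_score': a_score, 'b_score': b_score})

# Label-based: rows 0 through 2 (inclusive), columns chosen by name
print("loc")
print(df.loc[:2, ['country', 'group']])

# Position-based: rows 0 through 2 (3 is excluded), columns from position 2 onwards
print("iloc")
print(df.iloc[:3, 2:])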
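For comparison, the first 3 rows and last 2 columns can be selected with either indexer. The following is a minimal sketch on a cut-down copy of the data above (first four countries only); with_loc and with_iloc are illustrative names.

```python
import pandas as pd

# Cut-down copy of the article's data (first four countries only).
df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'A'],
    'country': ['Pakistan', 'USA', 'Canada', 'Brazil'],
    'a_score': [4, 5, 7, 8],
    'b_score': [1, 2, 3, 4],
})

# By label: rows 0 through 2 (inclusive) and the two score columns by name.
with_loc = df.loc[:2, ['a_score', 'b_score']]

# By position: the first 3 rows and every column from position 2 onwards.
with_iloc = df.iloc[:3, 2:]

print(with_loc.equals(with_iloc))  # True
```

The loc version keeps working if the columns are reordered, whereas the iloc version depends on the score columns staying in the last two positions.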