How to use the cumsum() function on a DataFrame in pandas
What is a DataFrame?
A DataFrame is a commonly used two-dimensional data structure in pandas. It is a table with labeled rows and columns.
A DataFrame can be created from a dictionary of lists, as shown in the example below.
Requirements
Working with DataFrames requires the pandas library, which is imported as shown below.
import pandas as pd
Code
Example
Below is a DataFrame of countries, each assigned to a group and given an a_score and a b_score.
Both scores are imaginary values for the purpose of this example.
import pandas as pd

# Imaginary scores for ten countries
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

# Build the DataFrame from a dictionary that maps column names to lists
df = pd.DataFrame({'group': groups, 'country': country, 'a_score': a_score, 'b_score': b_score})
print(df)
cumsum() function
The cumsum() function calculates the cumulative sum, that is, a running total in which each entry is the sum of all values up to and including that position.
Syntax
The general form of the call is as follows.
df.cumsum(axis=0, skipna=True)
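Parameters
axis: The axis along which the cumulative sum is taken. 0 or 'index' (the default) accumulates down each column, and 1 or 'columns' accumulates across each row.
skipna: Whether to skip NaN values. The default is True.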
Return value
The function returns a DataFrame of the same shape as the input, containing the cumulative sums along the chosen axis.
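To make the effect of the axis argument concrete, here is a minimal sketch. The small scores frame is a hypothetical example, not part of the original data, and contains only numeric columns.

```python
import pandas as pd

# Hypothetical two-column frame, used only to illustrate the axis parameter
scores = pd.DataFrame({'a_score': [4, 5, 7], 'b_score': [1, 2, 3]})

# axis=0 (the default): running total down each column
down_columns = scores.cumsum(axis=0)

# axis=1: running total across each row, from left to right
across_rows = scores.cumsum(axis=1)

print(down_columns)
print(across_rows)
```

With axis=0, the a_score column becomes 4, 9, 16. With axis=1, a_score is unchanged and b_score holds the row totals 5, 7, 10. Both results have the same shape as the input.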
Example
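The following example takes the cumulative sum of b_score within each group of the DataFrame formed above. The rows are grouped by the group column with groupby(), and the running total for each group is stored in a new column, cumsum_b.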
import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

df = pd.DataFrame({'group': groups, 'country': country, 'a_score': a_score, 'b_score': b_score})

# Running total of b_score, restarted for each group; rows keep their original order
df['cumsum_b'] = df.groupby('group')['b_score'].cumsum()
print(df)
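One way to sanity-check a grouped cumulative sum is to compare the last running total in each group with that group's plain sum. The sketch below does this on the group and b_score columns from the example. The names last_running and group_totals are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C'],
    'b_score': [1, 2, 3, 4, 5, 6, 7, 10, 8, 9],
})

# Per-group running total, as in the example above
df['cumsum_b'] = df.groupby('group')['b_score'].cumsum()

# The last running total in each group should equal the group's total
last_running = df.groupby('group')['cumsum_b'].last()
group_totals = df.groupby('group')['b_score'].sum()

print(df)
print((last_running == group_totals).all())
```

Group A, for instance, holds the values 1, 2, 4, and 10, so its running totals are 1, 3, 7, and 17, and 17 is also the sum of the group.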