How to create a new column with pandas

The pandas library is a widely used Python tool for data analysis. In this shot, we will go over three ways to create a new column in an existing DataFrame: indexing, the assign method, and the insert method.

Dataframe indexing

The simplest way to add a new column to an existing pandas DataFrame is to index the DataFrame with the new column's name and assign a list to it. The list must have one value per row:

import pandas as pd
# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO'],
})
print("Dataframe before adding new column:")
print(df)
# Adding salary column by indexing and assigning a list
df['salary'] = [200000, 70000, 110000, 670000]
print("Dataframe after adding new column:")
print(df)
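
Index assignment is not limited to lists. The sketch below, which uses a smaller DataFrame and hypothetical column names (Company, AgeNextYear), shows a scalar being broadcast to every row, a column derived from an existing one, and the error pandas raises when the list length does not match the number of rows:

```python
import pandas as pd

# A smaller, illustrative DataFrame
df = pd.DataFrame({'Name': ['Ali', 'Aqsa'], 'Age': [34, 26]})

# A scalar is broadcast to every row ('Acme' is a made-up value)
df['Company'] = 'Acme'

# A new column can be computed from an existing column
df['AgeNextYear'] = df['Age'] + 1

# A list whose length differs from the number of rows raises ValueError
mismatch_raised = False
try:
    df['salary'] = [200000]  # one value for two rows
except ValueError as err:
    mismatch_raised = True
    print("Could not add column:", err)

print(df)
```

Note that assigning to a column name that already exists overwrites that column rather than adding a second one.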

Using the assign method

Another way to add a column is the built-in assign method. Unlike indexing, assign does not modify the original DataFrame; it returns a new DataFrame that includes the added column. The Python code below shows how this can be done:

import pandas as pd
# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO'],
})
print("Dataframe before adding new column:")
print(df)
# Adding salary column using the assign method
df2 = df.assign(salary=[200000, 70000, 110000, 670000])
print("Dataframe after adding new column:")
print(df2)
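
Because assign returns a new DataFrame, the original is left untouched. The following sketch checks this on a smaller DataFrame and also shows that assign accepts a callable, which receives the DataFrame and returns the new column's values. The column name salary_k is a hypothetical choice for illustration:

```python
import pandas as pd

# A smaller, illustrative DataFrame
df = pd.DataFrame({'Name': ['Ali', 'Aqsa'], 'Age': [34, 26]})

# assign returns a new DataFrame; df itself is not changed
df2 = df.assign(salary=[200000, 70000])
print('salary' in df.columns)   # False
print('salary' in df2.columns)  # True

# A callable is evaluated against the DataFrame it is called on,
# so it can refer to columns that already exist there
df3 = df2.assign(salary_k=lambda d: d['salary'] // 1000)
print(df3)
```

To keep the result under the original name, reassign it: df = df.assign(...).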

Using the insert method

The insert method is another DataFrame method that can be used to create a new column. Unlike the previous techniques, which append the column to the end of the DataFrame, insert lets you place the new column at any position. Its first argument is the zero-based column position, followed by the column name and the values. The method modifies the DataFrame in place. Here's how the method is used:

import pandas as pd
# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO'],
})
print("Dataframe before adding new column:")
print(df)
# Adding salary column at position 1 (the second column) using the insert method
df.insert(1, "salary", [200000, 70000, 110000, 670000])
print("Dataframe after adding new column:")
print(df)
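
Two details of insert are easy to miss: it returns None because it changes the DataFrame in place, and by default it refuses to add a column whose name already exists. The sketch below illustrates both on a smaller DataFrame:

```python
import pandas as pd

# A smaller, illustrative DataFrame
df = pd.DataFrame({'Name': ['Ali', 'Aqsa'], 'Age': [34, 26]})

# insert works in place and returns None, so do not reassign its result
result = df.insert(0, 'salary', [200000, 70000])
print(result)            # None
print(list(df.columns))  # ['salary', 'Name', 'Age']

# Inserting a column name that already exists raises ValueError
duplicate_raised = False
try:
    df.insert(1, 'salary', [1, 2])
except ValueError as err:
    duplicate_raised = True
    print("Could not insert column:", err)
```

Passing allow_duplicates=True to insert permits the duplicate name, although duplicate column labels tend to make later selection by name ambiguous.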
