How to make a copy of a data frame in pandas
Overview
The copy method returns a copy of the DataFrame it is called on. A DataFrame can be copied in two ways:
- Deep copy: It creates a new DataFrame with a copy of the data and indices of the given DataFrame. Changes to the copy’s data or indices will not be reflected in the original DataFrame.
- Shallow copy: It creates a new DataFrame without copying the data or index of the caller object (only references to the data and index are copied). Without Copy-on-Write, any modification to the original’s data is mirrored in the copy (and vice versa). With Copy-on-Write enabled (the default behavior starting with pandas 3.0), such modifications are no longer shared between the two objects.
By default, the method makes a deep copy. Set the deep parameter to False to make a shallow copy instead.
Note: Refer to What is pandas in Python? to learn more about pandas.
Syntax
DataFrame.copy(deep=True)
Parameter
deep is a boolean parameter that indicates whether to make a deep or a shallow copy. If True, a deep copy is made. Otherwise, a shallow copy is made.
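The three ways of calling the method can be compared side by side. The snippet below is a minimal sketch; the variable names (default_copy, explicit_deep, shallow) and the sample data are chosen here for illustration only. It checks just that each call returns a new DataFrame object with the same values, not how the underlying data is stored.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})

default_copy = df.copy()            # deep is omitted, so it defaults to True
explicit_deep = df.copy(deep=True)  # the same request, spelled out
shallow = df.copy(deep=False)       # opt in to a shallow copy

# Every call returns a new DataFrame object holding the same values.
for label, copied in [
    ("default", default_copy),
    ("deep", explicit_deep),
    ("shallow", shallow),
]:
    print(label, "| new object:", copied is not df, "| equal values:", copied.equals(df))
```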
Code example (deep copy)
Let’s look at the code below:
import pandas as pd
data = [['dom', 10], ['abhi', 15], ['celeste', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Make a deep copy of df
df_deep_copy = df.copy(deep=True)

print("Original Dataframe - \n")
print(df)

print("Deep Copy Dataframe - \n")
print(df_deep_copy)

print("\n")
print("Changing value in original dataframe\n")
df.iloc[0, 1] = -9

print("Original Dataframe after changes - \n")
print(df)

print("Deep Copy Dataframe after changes - \n")
print(df_deep_copy)
Explanation
- Line 1: We import the pandas module.
- Lines 2 to 3: We create a DataFrame called df.
- Line 6: We get a deep copy of df called df_deep_copy using the copy method with the deep argument set to True.
- Lines 8 to 12: We print df and df_deep_copy.
- Line 16: We modify the Age column for one of the rows in df.
- Lines 18 to 22: We print df and df_deep_copy.
In the code above, modifying the original DataFrame does not affect the deep copy: df_deep_copy still holds an Age of 10 in its first row after df is changed to -9.
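It can help to contrast a deep copy with plain assignment, which does not copy anything and only gives the same object a second name. The following is a minimal sketch of that contrast; the names alias and deep are illustrative and not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})

alias = df                 # plain assignment: a second name for the same object
deep = df.copy(deep=True)  # independent data and index

# Change the original; the alias follows, the deep copy does not.
df.loc[0, "Age"] = -9
alias_is_same_object = alias is df
alias_age = int(alias.loc[0, "Age"])
deep_age = int(deep.loc[0, "Age"])

# Independence works in the other direction too: change the deep copy.
deep.loc[1, "Age"] = 100
original_age = int(df.loc[1, "Age"])

print(alias_is_same_object, alias_age, deep_age, original_age)
```

Here alias always reflects whatever happens to df, while deep and df can be edited separately.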
Code example (shallow copy)
Let’s look at the code below:
import pandas as pd
data = [['dom', 10], ['abhi', 15], ['celeste', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Make a shallow copy of df
df_shallow_copy = df.copy(deep=False)

print("Original Dataframe - \n")
print(df)

print("Shallow Copy Dataframe - \n")
print(df_shallow_copy)

print("\n")
print("Changing value in original dataframe\n")
df.iloc[0, 1] = -9

print("Original Dataframe after changes - \n")
print(df)

print("Shallow Copy Dataframe after changes - \n")
print(df_shallow_copy)
Explanation
- Line 1: We import the pandas module.
- Lines 2 to 3: We create a DataFrame called df.
- Line 6: We get a shallow copy of df called df_shallow_copy using the copy method with the deep argument set to False.
- Lines 8 to 12: We print df and df_shallow_copy.
- Line 16: We modify the Age column for one of the rows in df.
- Lines 18 to 22: We print df and df_shallow_copy.
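In the code above, modifying the original DataFrame also changes the shallow copy, because both objects refer to the same underlying data. This holds only when Copy-on-Write is not enabled. With Copy-on-Write (the default behavior starting with pandas 3.0), the shallow copy keeps its original values.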
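Because the outcome depends on the pandas version and on whether Copy-on-Write is active, it may be worth checking the behavior in your own environment rather than assuming it. The sketch below does that; the variable name propagated is hypothetical, and the printed result is expected to differ between setups (typically True on older pandas without Copy-on-Write, False with Copy-on-Write).

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})
shallow = df.copy(deep=False)

# Modify the original after taking the shallow copy.
df.iloc[0, 1] = -9

# Did the change reach the shallow copy? This depends on the pandas setup.
propagated = bool(shallow.iloc[0, 1] == -9)

print("pandas version:", pd.__version__)
print("Change visible in shallow copy:", propagated)
```

If the code relies on two DataFrames staying independent, a deep copy is the safer choice on any version.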