Rename Attributes
Explore how to rename columns in both Pandas and PySpark DataFrames so that they follow a consistent snake case naming standard. This lesson guides you through defining a mapping from old to new column names and applying it in Pandas and PySpark to improve data clarity and management.
We'll cover the following...
In an organization, data is generated from different sources, systems, and processes before it reaches us, so the column naming might surprise us. We’ll encounter different ...
# Map each original column name to its snake case equivalent.
COL_NAME_MAP = {
    "overall": "overall",
    "verified": "verified",
    "reviewTime": "review_time",
    "reviewerID": "reviewer_id",
    "asin": "asin",
    "reviewerName": "reviewer_name",
    "reviewText": "review_text",
    "summary": "summary",
    "unixReviewTime": "unix_review_time",
    "style": "style",
    "vote": "vote",
    "image": "image",
}

# raw_pdf is assumed to be the Pandas DataFrame loaded earlier in the lesson.
print('Initial column names:')
for i, col_name in enumerate(raw_pdf.columns, start=1):
    print(f'{i}: {col_name}')

# Rename the columns using the mapping.
raw_pdf = raw_pdf.rename(columns=COL_NAME_MAP)

print('___________________________')
print('Column names after rename:')
for i, col_name in enumerate(raw_pdf.columns, start=1):
    print(f'{i}: {col_name}')

print('___________________________')
print('Code Executed Successfully')
Renaming columns in Pandas
After successful code execution, we’ll see the message ...
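The mapping above is written by hand. As a minimal sketch, and not part of the lesson's own code, the same mapping could be derived automatically from the camelCase column names with a regular expression. The helpers `camel_to_snake` and `build_rename_map` are hypothetical names introduced here for illustration, and the column list is copied from the keys of `COL_NAME_MAP`.

```python
import re

# Hypothetical helpers for illustration; they are not part of the lesson's code.
# The pattern matches the zero-width boundary between a lowercase letter or
# digit and an uppercase letter ("reviewTime" -> "review|Time"), and between
# an acronym and a following capitalized word ("HTTPResponse" -> "HTTP|Response").
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def camel_to_snake(name: str) -> str:
    """Convert a camelCase name to snake case."""
    return _WORD_BOUNDARY.sub("_", name).lower()


def build_rename_map(columns):
    """Build an {old_name: new_name} mapping and reject name collisions."""
    mapping = {col: camel_to_snake(col) for col in columns}
    targets = list(mapping.values())
    duplicates = sorted({t for t in targets if targets.count(t) > 1})
    if duplicates:
        # Two source columns would end up with the same name after renaming.
        raise ValueError(f"Colliding target names: {duplicates}")
    return mapping


# Column names taken from the keys of COL_NAME_MAP in the lesson.
columns = [
    "overall", "verified", "reviewTime", "reviewerID", "asin",
    "reviewerName", "reviewText", "summary", "unixReviewTime",
    "style", "vote", "image",
]

col_name_map = build_rename_map(columns)
for old_name, new_name in col_name_map.items():
    print(f"{old_name} -> {new_name}")
```

The resulting dictionary can be passed to `rename(columns=...)` in the same way as the hand-written `COL_NAME_MAP`. Reviewing the generated names is still worthwhile, because a regular expression cannot know about domain-specific abbreviations.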