Data normalization in Python

Normalization refers to rescaling real-valued numeric attributes into a 0 to 1 range.
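As a minimal sketch of this rescaling, assuming the common min-max formula (x - min) / (max - min), the snippet below maps a small array onto the 0 to 1 range. The helper name min_max_scale and the sample values are illustrative and are not part of any library.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1-D array linearly so its minimum maps to 0 and its maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values are identical, so there is no range to scale; return zeros.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Illustrative data (hypothetical heights in centimetres).
heights_cm = np.array([150.0, 160.0, 175.0, 200.0])
scaled = min_max_scale(heights_cm)
print(scaled)  # values: 0.0, 0.2, 0.5, 1.0
```

The smallest value becomes 0, the largest becomes 1, and everything else keeps its relative position in between.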

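Data normalization is used in machine learning to make model training less sensitive to the scale of features. When one feature spans a much larger numeric range than the others, it can dominate gradient-based or distance-based learning. Putting the features on a comparable scale often helps the model converge to better weights, which in turn can lead to a more accurate model.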

Left: Original Data, Right: Normalized Data

Normalization makes the features more consistent with each other, which can help the model predict outputs more accurately.

Code

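The scikit-learn library provides the preprocessing module, which contains the normalize function to normalize the data. It takes an array as input and, by default, rescales each row (sample) to unit L2 norm. For non-negative inputs, such as the example below, every resulting value lies between 0 and 1. Note that this is not the same as min-max scaling each feature. The function returns an output array with the same dimensions as the input.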

from sklearn import preprocessing
import numpy as np
a = np.random.random((1, 4))
a = a*20
print("Data = ", a)
# normalize the data: each row is scaled to unit L2 norm (the default)
normalized = preprocessing.normalize(a)
print("Normalized Data = ", normalized)
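To make the behaviour of normalize more concrete, here is a hedged sketch that compares it with MinMaxScaler from the same preprocessing module on a small two-feature matrix. The matrix X and its interpretation (age and income) are made up for illustration. MinMaxScaler is shown as one way to get a per-feature 0 to 1 range, not as the only option.

```python
import numpy as np
from sklearn import preprocessing

# Illustrative feature matrix: rows are samples, columns are features
# (hypothetically, age in years and income in dollars).
X = np.array([[20.0, 30000.0],
              [40.0, 60000.0],
              [60.0, 90000.0]])

# normalize works row by row: each sample is divided by its own L2 norm.
row_normalized = preprocessing.normalize(X)
print(np.linalg.norm(row_normalized, axis=1))  # every row now has length 1

# MinMaxScaler works column by column: each feature is mapped onto [0, 1].
column_scaled = preprocessing.MinMaxScaler().fit_transform(X)
print(column_scaled)
```

With normalize, the large income values still dominate each row. With MinMaxScaler, both columns end up on the same 0 to 1 scale, which is usually what is meant when features are normalized before training.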

Copyright ©2024 Educative, Inc. All rights reserved