Numerical Variables Transformation
Numerical Variables Transformation refers to applying mathematical operations to the numerical columns of a dataset to improve the performance of Machine Learning models. This lesson covers the most common transformations.
In the lesson on Probability Distributions, while discussing Gaussian Distributions, we noted that Machine Learning algorithms such as Linear Regression or Logistic Regression assume that the features' underlying distribution is Gaussian. If it is not, the model might perform badly, and we can apply transformations to bring the distribution closer to Gaussian. Machine Learning models that assume the underlying distribution of the variables to be Gaussian are:
 Linear Regression
 Logistic Regression
 Linear Discriminant Analysis
 Naive Bayes
We can apply the following transformations to the dataset's individual features after analyzing them. Transformations can help us achieve good results by making the underlying features more Gaussian-like.

Logarithm Transformation: This transformation is used on features that have strictly positive values, because the logarithm is undefined for zero and negative values. The logarithm used here is the natural logarithm.
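
As a minimal sketch of the logarithm transformation, the snippet below applies NumPy's natural logarithm to a small made-up feature. The array values and variable names are illustrative assumptions, and `np.log1p` is shown only as a commonly used variant for data that contains zeros, not as something this lesson prescribes.

```python
import numpy as np

# Hypothetical right-skewed feature with strictly positive values.
x = np.array([1.0, 10.0, 100.0, 1000.0])

# Natural logarithm; only defined for x > 0.
x_log = np.log(x)

# Variant (assumption, not from the lesson): log1p computes log(1 + x),
# which stays finite when the feature contains zeros.
x_with_zero = np.array([0.0, 9.0, 99.0])
x_log1p = np.log1p(x_with_zero)

print(x_log)
print(x_log1p)
```

The transformation compresses large values much more than small ones, which is why it tends to help with right-skewed features.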

Reciprocal Transformation ($\frac{1}{x}$ where $x$ is one of the values of the feature): This transformation can be applied to negative as well as positive values, but it is undefined for the value $0$.
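
The reciprocal transformation could be sketched as follows. The sample values are made up, and leaving zeros as NaN is just one possible way to handle the undefined case; the lesson does not specify one.

```python
import numpy as np

# Hypothetical feature containing negative values, positive values, and a zero.
x = np.array([-4.0, -0.5, 0.0, 2.0, 8.0])

# Start from NaN everywhere, then fill in 1/x only where x is nonzero.
# Treating zeros as NaN is an assumption made for this sketch.
x_recip = np.full_like(x, np.nan)
nonzero = x != 0
x_recip[nonzero] = 1.0 / x[nonzero]

print(x_recip)
```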

Square Root or Cube Root Transformation: This transformation falls under the category of Power Transformations and involves taking the power $x^{\frac{1}{2}}$ or $x^{\frac{1}{3}}$, where $x$ is an individual value of a feature. The square root requires non-negative values, while the cube root is also defined for negative values.
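
A short sketch of both root transformations using NumPy. The arrays are made-up examples chosen so that the results are easy to check by hand.

```python
import numpy as np

# Hypothetical non-negative feature for the square root.
x = np.array([0.0, 1.0, 4.0, 9.0])
x_sqrt = np.sqrt(x)  # x ** (1/2); requires x >= 0

# Hypothetical feature with a negative value for the cube root.
y = np.array([-8.0, 0.0, 27.0])
y_cbrt = np.cbrt(y)  # real cube root; works for negative inputs as well

print(x_sqrt)
print(y_cbrt)
```

Note that `np.cbrt` is used instead of `y ** (1/3)` because the latter returns NaN for negative floating-point inputs.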

Exponential or Power Transformations: These involve raising each individual value of a feature to a power (i.e., $x^\lambda$), where $\lambda$ is any number. The goal is to try different values of $\lambda$ and see which one works best for the case at hand.
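
One way to "try different values of $\lambda$" is sketched below. The candidate list, the synthetic lognormal data, and the use of absolute sample skewness as the score are all assumptions made for illustration; the lesson does not prescribe a particular criterion.

```python
import numpy as np
from scipy.stats import skew

# Hypothetical right-skewed, strictly positive feature.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

# Candidate exponents to try (an arbitrary illustrative choice).
candidate_lambdas = [0.25, 0.5, 1.0, 2.0]

# Score each candidate by the absolute skewness of x ** lambda;
# a value closer to 0 indicates a more symmetric distribution.
skewness = {lam: abs(skew(x ** lam)) for lam in candidate_lambdas}
best_lambda = min(skewness, key=skewness.get)

for lam, s in skewness.items():
    print(f"lambda={lam}: |skewness|={s:.3f}")
print("least skewed candidate:", best_lambda)
```

Skewness only measures asymmetry, so in practice it would be worth also looking at a histogram or a Q-Q plot before settling on a value.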

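Box-Cox Transform: The Box-Cox Transform performs different transformations depending on the value of the parameter $\lambda$. It is defined as $\frac{x^\lambda - 1}{\lambda}$ for $\lambda \neq 0$ and $\ln x$ for $\lambda = 0$, and it requires strictly positive values. The boxcox() SciPy function implements the Box-Cox transformation. It takes an argument, called lmbda, that controls the type of transform to perform.
Below are some common values for $\lambda$: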
 $\lambda = -1.0$ is a reciprocal transform.
 $\lambda = -0.5$ is a reciprocal square root transform.
 $\lambda = 0.0$ is a log transform.
 $\lambda = 0.5$ is a square root transform.
 $\lambda = 1.0$ is no transform (the values are only shifted by a constant).
 If $\lambda$ is not specified, the function estimates the value that maximizes the log-likelihood of the data.
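
The behavior described above can be sketched with `scipy.stats.boxcox`. The synthetic lognormal data and variable names are assumptions for illustration. Note that the keyword argument is spelled `lmbda`, since `lambda` is a reserved word in Python.

```python
import numpy as np
from scipy.stats import boxcox

# Hypothetical strictly positive, right-skewed feature (Box-Cox requires x > 0).
rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=0.5, size=500)

# Fixed lambda: lmbda=0 is the natural log transform.
x_log = boxcox(x, lmbda=0)

# Fixed lambda: lmbda=0.5 gives (sqrt(x) - 1) / 0.5, a shifted and scaled square root.
x_half = boxcox(x, lmbda=0.5)

# lmbda omitted: SciPy returns the transformed data together with the
# lambda that maximizes the log-likelihood.
x_bc, fitted_lambda = boxcox(x)

print("fitted lambda:", fitted_lambda)
```

When `lmbda` is given, the function returns only the transformed array; when it is omitted, it returns a tuple, so the two call styles need to be unpacked differently.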