Error: ‘PCA’ object has no attribute ‘explained_variance_ratio_’

The explained_variance_ratio_ attribute is a valid attribute of the PCA class in scikit-learn’s (sklearn’s) sklearn.decomposition module. It gives the fraction of the total variance explained by each of the selected principal components. Sometimes, accessing this attribute in our code raises an AttributeError saying that it does not exist. The possible causes are listed below.

Possible causes

This sort of error can occur due to the following reasons:

  • Outdated sklearn version: The installed sklearn version may be too old for our code.

  • PCA not fitted: The PCA object has been created but not yet fitted to the data. The explained_variance_ratio_ attribute is only set once fit (or fit_transform) has been called.

Cause 1: Outdated sklearn version

When we implement PCA, we sometimes overlook which version of sklearn our project uses. Attributes can be added, renamed, or removed between releases, so code written for one version may fail on a much older one.

Solution

To solve this, we upgrade to a recent sklearn version that supports this attribute, using the command below:

pip install --upgrade scikit-learn

After the command runs, the terminal should report that the latest version of sklearn, along with its dependencies, was installed successfully.
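To confirm which version is actually installed in the active environment, a quick check from Python can help. The following is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical and not part of sklearn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="scikit-learn"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if scikit-learn is not installed
print("scikit-learn version:", installed_version())
```

If the version printed here does not change after upgrading, pip may be installing into a different environment than the one running the code. Running the upgrade as python -m pip install --upgrade scikit-learn ties pip to the same interpreter.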

Cause 2: PCA not fitted

When we work with a PCA model, we might create the PCA object but forget to fit it to the data before accessing the explained_variance_ratio_ attribute, as shown below:

from sklearn.decomposition import PCA
# Load or define the data as X
X = [[1,2,3], [4,5,6]]
# Create a PCA object
pca = PCA(n_components=2)
# Access the explained variance ratios (raises AttributeError because fit was never called)
explained_var_ratio = pca.explained_variance_ratio_
# Print or use the explained variance ratios
print("Explained Variance Ratios:", explained_var_ratio)

Solution

We must ensure that the PCA object is fitted before the explained_variance_ratio_ attribute is accessed. To do this, we call the fit method (or fit_transform) on our 2D data right after creating the object.

Here’s an example to understand this further:

from sklearn.decomposition import PCA
# Load or define the data as X
X = [[1,2,3], [4,5,6]]
# Create a PCA object
pca = PCA(n_components=2)
# Fit the PCA model to the data
pca.fit(X)
# Access the explained variance ratios
explained_var_ratio = pca.explained_variance_ratio_
# Print or use the explained variance ratios
print("Explained Variance Ratios:", explained_var_ratio)

We can see that the PCA object is created with n_components equal to 2, meaning that two principal components will be retained after dimensionality reduction. The call to pca.fit(X) then computes the components, after which the explained_variance_ratio_ attribute becomes available.
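To guard against accessing the attribute too early, one option is to check whether the estimator has been fitted first. The sketch below relies on sklearn's check_is_fitted utility and NotFittedError exception; the helper name safe_explained_variance_ratio and the sample data are hypothetical and used only for illustration.

```python
from sklearn.decomposition import PCA
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


def safe_explained_variance_ratio(pca):
    """Return the explained variance ratios, or None if the PCA is not fitted yet."""
    try:
        check_is_fitted(pca)
    except NotFittedError:
        return None
    return pca.explained_variance_ratio_


# Illustrative data: 3 samples with 3 features each
X = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
pca = PCA(n_components=2)

# Not fitted yet, so the helper returns None instead of raising
print(safe_explained_variance_ratio(pca))

# After fitting, the helper returns one ratio per retained component
pca.fit(X)
print(safe_explained_variance_ratio(pca))
```

A check like this is mostly useful in helper functions that receive a PCA object from elsewhere and cannot assume it has already been fitted.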

Conclusion

By following the solutions above, we can identify and resolve the error about the missing explained_variance_ratio_ attribute. The attribute is a valuable tool for understanding the proportion of total variance captured by each principal component when performing PCA. By computing these ratios, we can make informed decisions about the number of components to retain when reducing the dimensionality of our data.
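As an illustration of using the ratios to decide how many components to keep, the sketch below picks the smallest number of components whose cumulative explained variance reaches a target. The synthetic data and the 95% threshold are assumptions chosen for the example, not recommendations.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data for illustration only: 200 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.1 * X[:, 1]  # make two features strongly correlated

# Fit with all components so the ratios sum to 1
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the threshold
threshold = 0.95  # assumed target, adjust as needed
n_keep = int(np.searchsorted(cumulative, threshold) + 1)

print("Cumulative explained variance:", cumulative)
print("Components needed for 95%:", n_keep)
```

sklearn can also make this choice directly: passing a float between 0 and 1 as n_components, for example PCA(n_components=0.95), asks it to keep enough components to explain at least that fraction of the variance.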

Copyright ©2024 Educative, Inc. All rights reserved