Building the Model

Explore how to create k-means clustering models using PyCaret's create_model function. Learn to evaluate model performance with silhouette scores, use the elbow method to identify the optimal cluster count, and visualize clusters with PCA plots. Understand how to save and assign models for deployment and further analysis.

Creating a model

The create_model() function lets us create and evaluate the clustering model of our choice, such as the k-means algorithm. By default, it creates 4 clusters. We could set the num_clusters parameter to 3 because this is the correct number. Instead, we'll follow an approach that generalizes to real-world datasets, where the number of clusters is typically unknown. After it runs, the function prints several performance metrics, including the silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index. We'll focus on the silhouette coefficient, defined in the following equation.

$$s(i)=\frac{b(i)-a(i)}{\max \{a(i), b(i)\}}$$ ...
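
To make the silhouette coefficient concrete, here is a minimal sketch that computes s(i) by hand with the Python standard library only. It assumes the usual definitions: a(i) is the mean distance from point i to the other points in its own cluster, and b(i) is the smallest mean distance from point i to the points of any other cluster. The function name silhouette_samples, the toy points, and the choice of Euclidean distance are assumptions made for illustration; this is not how PyCaret computes the metric internally.

```python
import math


def silhouette_samples(points, labels):
    """Return s(i) for every point, using Euclidean distance.

    Assumes at least two distinct cluster labels are present.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a point alone in its cluster gets s(i) = 0.
            scores.append(0.0)
            continue
        # a(i): mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores


# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

scores = silhouette_samples(points, labels)
mean_score = sum(scores) / len(scores)
print(scores)
print(mean_score)
```

Because the two groups are compact and far apart, every s(i) here comes out close to 1. Averaging s(i) over all points gives a single score for the whole clustering, which is the kind of summary value reported alongside the other metrics.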