PyCaret is an open-source machine learning library that covers the whole range of machine learning tasks, from early steps such as data preparation to final steps such as model deployment. Each of these steps can typically be performed with a single line of code. PyCaret is essentially a wrapper around other machine learning libraries and frameworks, such as scikit-learn, XGBoost, and LightGBM.
The general workflow of a machine learning project that uses PyCaret is as follows:
The first step in any machine learning task is to import or load the dataset. For this, PyCaret has the get_data() function, which loads a dataset into a pandas DataFrame.
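As a minimal sketch of this step, the helper below wraps get_data(). The function name load_dataset and the choice of the "juice" sample dataset are assumptions made for illustration; any name listed by get_data("index") should work the same way.

```python
def load_dataset(name="juice"):
    """Return one of PyCaret's sample datasets as a pandas DataFrame."""
    # Imported inside the function so that PyCaret is only required
    # when the function is actually called.
    from pycaret.datasets import get_data

    # get_data() fetches the named dataset from PyCaret's dataset
    # repository, so an internet connection is normally needed.
    return get_data(name)
```

Calling load_dataset("juice") returns the DataFrame, which can then be inspected with the usual pandas methods such as head() or describe().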
After loading the dataset, we can visualize and analyze the data. PyCaret supports several forms of data analysis and visualization:
After an exploratory data analysis, we set up the PyCaret environment. For this purpose, PyCaret has the setup() function.
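A hedged sketch of the setup step for a classification task follows. The helper name init_experiment, the target column "Purchase" (the label column of the "juice" sample dataset), and the seed value are assumptions for illustration. Depending on the PyCaret version, setup() may also pause and ask you to confirm the inferred column types.

```python
def init_experiment(data, target="Purchase", seed=123):
    """Initialise a PyCaret classification experiment on a DataFrame."""
    # Imported lazily so PyCaret is only needed when this is called.
    from pycaret.classification import setup

    # setup() infers column types, splits the data into training and
    # hold-out sets, and prepares the preprocessing pipeline.
    # session_id fixes the random seed so that runs are reproducible.
    return setup(data=data, target=target, session_id=seed)
```

For regression or clustering problems, the same function name is imported from pycaret.regression or pycaret.clustering instead.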
This process is divided into various steps:
In the first step, we use the create_model() function to create the model.
In the second step, we use the tune_model() function to tune the model.
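These two steps might look like the sketch below. It assumes setup() has already been called in the current session for a classification task; the helper name build_and_tune is hypothetical, and "lr" is PyCaret's identifier for logistic regression, chosen here only as an example.

```python
def build_and_tune(estimator="lr"):
    """Train a model with cross-validation, then tune its hyperparameters.

    Assumes pycaret.classification.setup() has already been run.
    """
    # Imported lazily so PyCaret is only needed when this is called.
    from pycaret.classification import create_model, tune_model

    # create_model() trains the estimator using cross-validation and
    # reports the fold-by-fold scores.
    base_model = create_model(estimator)

    # tune_model() searches over hyperparameters for the given model
    # and returns the tuned estimator.
    tuned_model = tune_model(base_model)
    return base_model, tuned_model
```

Returning both models makes it easy to compare the tuned version against the baseline before deciding which one to keep.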
To generate predictions with the model, we use the predict_model() function.
To plot the model's performance, we use the plot_model() function.
To finalize and save the model, we use the finalize_model() and save_model() functions, respectively.
All of these steps are performed in the order in which they are listed here.
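The closing steps (predict, plot, finalize, save) can be sketched in that order as follows. This is an illustration under assumptions rather than a definitive recipe: the helper name evaluate_and_save, the "auc" plot type, and the file name "final_model" are all choices made for the example, and setup() is assumed to have been run for a classification task.

```python
def evaluate_and_save(model, plot="auc", model_name="final_model"):
    """Predict, plot, finalize, and save a trained PyCaret model.

    Assumes pycaret.classification.setup() has already been run.
    """
    # Imported lazily so PyCaret is only needed when this is called.
    from pycaret.classification import (
        finalize_model,
        plot_model,
        predict_model,
        save_model,
    )

    # Without a data argument, predict_model() scores the hold-out set
    # that setup() put aside.
    holdout_predictions = predict_model(model)

    # Draw a diagnostic plot, for example "auc" or "confusion_matrix".
    plot_model(model, plot=plot)

    # finalize_model() refits the model on the full dataset,
    # including the hold-out portion.
    final_model = finalize_model(model)

    # save_model() writes the pipeline and model to "<model_name>.pkl".
    save_model(final_model, model_name)
    return holdout_predictions, final_model
```

A saved model can later be restored with load_model("final_model") from the same module and passed to predict_model() together with new data.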