How to implement logistic regression using the scikit-learn toolkit

Logistic regression is a supervised classification algorithm. We’ve discussed what logistic regression is here. Now we will implement it using the scikit-learn toolkit.

We’ll use the wine dataset to train the logistic regression model from scikit-learn. We split the data into train and test sets (80-20 split) so that we can check how well the classifier generalizes to unseen data.

Importing the necessary libraries

We import the wine dataset loader from sklearn’s built-in datasets. We will use sklearn’s train_test_split function to split the data into train and test samples. For evaluation, we use sklearn’s confusion_matrix and accuracy_score functions. Finally, we import the LogisticRegression class from the sklearn library, as shown below:

import numpy as np
from sklearn.datasets import load_wine 
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression 
from sklearn.metrics import confusion_matrix, accuracy_score

Loading the dataset

We load the wine dataset from the sklearn library into a local variable called dataset.

dataset = load_wine()

Train test split

We split the data into train and test sets using the train_test_split function imported above. We use an 80-20 split, where 80% of the data is used for training and 20% for testing. x_train and y_train contain the training data and labels respectively, while x_test and y_test contain the testing data and labels. Fixing random_state makes the split reproducible.

x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.20, random_state=15)
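
As a quick sanity check on the split, the sketch below loads the wine data and prints the shapes of the resulting arrays. It assumes the dataset is loaded with load_wine(), as in the full listing later in this article. The shapes in the comments follow from the 178 samples and 13 features in sklearn's copy of the dataset.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Load the wine data (assumed here; the article's full listing does the same)
dataset = load_wine()

# Same 80-20 split and random_state as in the article
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.20, random_state=15
)

print(dataset.data.shape)           # (178, 13)
print(x_train.shape, x_test.shape)  # (142, 13) (36, 13)
```

With test_size=0.20, sklearn rounds the test set up to 36 samples and leaves the remaining 142 for training.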

Making the logistic regression model

We create the logistic regression model with sklearn’s LogisticRegression class, using its default settings.

logistic_model = LogisticRegression()

Training the model

We train the model on the training data and labels using the fit method.

logistic_model.fit(x_train, y_train)

Predicting the labels of the test data

We use the trained model to predict the labels of the test data.

y_pred = logistic_model.predict(x_test)

Evaluating scores

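We pass the true test labels and the predicted labels to accuracy_score, which returns the fraction of predictions that are correct. We multiply by 100 to express the accuracy as a percentage.

Similarly, we pass the same labels to confusion_matrix. In the resulting matrix, each row corresponds to a true class and each column to a predicted class.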

accuracy = accuracy_score(y_test,y_pred)*100

confusion_mat = confusion_matrix(y_test,y_pred)
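
To see what these two functions return, here is a minimal sketch on a small hand-made example. The label lists y_true and y_hat are made up for illustration and are not taken from the wine data.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels for five samples across three classes
y_true = [0, 1, 2, 2, 1]
y_hat = [0, 2, 2, 2, 1]  # one class-1 sample is mispredicted as class 2

toy_accuracy = accuracy_score(y_true, y_hat) * 100
toy_confusion = confusion_matrix(y_true, y_hat)

print(toy_accuracy)  # 80.0 (4 of 5 predictions are correct)
print(toy_confusion)
# [[1 0 0]
#  [0 1 1]
#  [0 0 2]]
```

The off-diagonal 1 in the second row is the class-1 sample that was predicted as class 2. Every correct prediction lands on the diagonal.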

# Importing the necessary libraries
import numpy as np
from sklearn.datasets import load_wine 
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression 
from sklearn.metrics import confusion_matrix, accuracy_score 
# Importing the dataset from the sklearn library into a local variable called dataset
dataset = load_wine()
# Splitting the dataset into train (80%) and test (20%) sets
# x_train, y_train are training data and labels respectively 
# x_test, y_test are testing data and labels respectively 
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.20, random_state=15)
# Making the logistic regression model
logistic_model = LogisticRegression()
# Training the model on the training data and labels
logistic_model.fit(x_train, y_train)
# Using the model to predict the labels of the test data
y_pred = logistic_model.predict(x_test)
# Evaluating the accuracy of the model using the sklearn functions
accuracy = accuracy_score(y_test,y_pred)*100
confusion_mat = confusion_matrix(y_test,y_pred)
# Printing the results
print("Accuracy is",accuracy)
print("Confusion Matrix")
print(confusion_mat)
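
On unscaled features such as the wine data, LogisticRegression with default settings may reach its iteration limit and emit a ConvergenceWarning. The sketch below is one possible variation, not part of the original walkthrough. It standardizes the features with StandardScaler inside a pipeline and raises max_iter. The names scaled_model and y_pred_scaled and the value max_iter=1000 are assumptions chosen for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

dataset = load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.20, random_state=15
)

# Scale each feature to zero mean and unit variance, then fit the classifier.
# The scaler is fitted on the training data only, so no test data leaks in.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(x_train, y_train)

y_pred_scaled = scaled_model.predict(x_test)
print("Accuracy is", accuracy_score(y_test, y_pred_scaled) * 100)
print("Confusion Matrix")
print(confusion_matrix(y_test, y_pred_scaled))
```

Wrapping both steps in a pipeline means the same scaling is applied automatically whenever predict is called on new data.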
