Logistic Regression Example

 · Nima Moradi

Logistic Regression

Logistic Regression is, in its basic form, a binary classification algorithm: given an input, it predicts the probability that the output is 1 rather than 0. Regression analysis more generally is a method for finding a relationship between a dependent variable and one or more independent variables.

For instance, suppose we own a gym and want to offer a weight loss program to our members. We have gathered data on exercise hours and weight loss over a single month, and we want to predict how much exercise time our newcomer, John, needs to lose 5 pounds in a month. One approach is to draw a line through the data, chosen to minimize the sum of squared errors (least squares), that captures the pattern between exercise hours and weight loss. This relationship can be modeled with a linear equation:

ŷ = b0 + b1x
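As a minimal sketch of the least-squares fit described above, the snippet below computes b0 and b1 in closed form using only the standard library. The hours and weight-loss numbers are made up for illustration and are not the gym's actual data; the helper name `fit_line` is likewise hypothetical.

```python
# Hypothetical data (made up for illustration): monthly exercise hours
# and the corresponding weight loss in pounds.
hours = [5.0, 10.0, 15.0, 20.0, 25.0]
loss = [1.2, 1.9, 3.1, 3.8, 5.0]


def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals of y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(hours, loss)

# Hours needed for a 5-pound loss under the fitted line: solve 5 = b0 + b1 * x.
hours_for_5 = (5.0 - b0) / b1
print(b0, b1, hours_for_5)
```

Note that nothing constrains the fitted line's output: for large enough x it predicts arbitrarily large weight loss, which is fine for a quantity in pounds but not for a probability.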

The problem is that the gym owner now wants to predict the probability that John will lose more than 5 pounds with 20 hours of exercise a month. A probability must always lie between 0 and 1, whereas the output of a linear model, like weight loss itself, can be any number. This is where logistic regression comes in: it applies the sigmoid function to squeeze the output of the linear equation into the range between 0 and 1.

[Figure: sigmoid function diagram]

ŷ = σ(b0 + b1x)
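To make the formula concrete, here is a small sketch of the sigmoid applied to a linear score. The coefficients `b0` and `b1` are hypothetical values chosen only to show the shape of the model; in practice they are learned from data.

```python
import math


def sigmoid(z):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -4.0, 0.2


def predict_probability(x):
    """Logistic model: the sigmoid of the linear score b0 + b1*x."""
    return sigmoid(b0 + b1 * x)


# Probability of reaching the goal for 0, 20, and 40 hours of exercise.
for hours in (0, 20, 40):
    print(hours, predict_probability(hours))
```

With these assumed coefficients the linear score is zero at 20 hours, so the predicted probability there is one half; fewer hours push it toward 0 and more hours push it toward 1. This plain version of `sigmoid` can overflow for very large negative inputs, which is acceptable for a sketch.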

We will classify the Iris dataset with logistic regression.


from sklearn.linear_model import LogisticRegression

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

For working with the data more easily


import pandas as pd
import numpy as np 

For visualizing data


import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

Loading dataset

sklearn has built-in tools for loading common datasets. For educational purposes, we load the data directly from a CSV file instead.


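# Assumes the first four columns hold the features and the fifth the species
# label; adjust the indices if your CSV has a leading Id column.
dataset = pd.read_csv('./datasets/Iris.csv')
X = dataset.iloc[:, 0:4].values   # feature matrix
Y = dataset.iloc[:, 4].values     # class labels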
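For comparison, the built-in loader mentioned above can be sketched as follows. This is an alternative to the CSV route, not what the post itself uses; the variable names `X_builtin` and `y_builtin` are illustrative.

```python
from sklearn.datasets import load_iris

# Load the copy of Iris that ships with scikit-learn.
iris = load_iris()

X_builtin = iris.data     # 150 rows, 4 numeric features
y_builtin = iris.target   # integer class labels 0, 1, 2

print(X_builtin.shape, y_builtin.shape)
print(iris.feature_names)
print(iris.target_names)
```

One difference to keep in mind: the built-in loader encodes the species as integers, whereas a CSV file typically stores them as strings. `LogisticRegression` accepts either form as labels.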

Splitting dataset into training and testing


X_train, X_test, y_train, y_test = train_test_split(X, Y, 
                                                    test_size=0.25, 
                                                    random_state=0)
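To see what the split produces, the following sketch applies the same `test_size` and `random_state` to scikit-learn's bundled copy of Iris (an assumption made so the snippet runs without the CSV file) and prints the resulting shapes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled Iris data, used here so the sketch does not depend on a local CSV.
X_demo, y_demo = load_iris(return_X_y=True)

# Same split settings as in the post: 25% of the rows are held out for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

print(X_tr.shape, X_te.shape)
print(y_tr.shape, y_te.shape)
```

Fixing `random_state` makes the shuffle reproducible between runs. Passing `stratify=y_demo` would additionally keep the class proportions equal in both parts; the split in this post does not do that.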

Define and fit the model


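# 'multinomial' fits one softmax model over all classes (recent scikit-learn
# versions deprecate this argument and use multinomial by default with lbfgs).
clf = LogisticRegression(random_state=0, solver='lbfgs',
                          multi_class='multinomial').fit(X_train, y_train)
# Accuracy over the full dataset (training and test rows combined)
print(clf.score(X, Y))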

Predicting the Test set results


y_pred = clf.predict(X_test)
print(y_pred)

# Accuracy on the test set (fraction of correctly classified samples)
print(accuracy_score(y_test, y_pred))
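Since logistic regression models probabilities, it can be useful to look at them directly rather than only at the predicted labels. The sketch below is self-contained and assumes scikit-learn's bundled Iris data in place of the CSV file; `max_iter=1000` is an assumption added to give the solver room to converge, not a setting from the post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Bundled Iris data, split the same way as in the post.
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# One row per sample, one column per class; each row sums to 1.
proba = model.predict_proba(X_te[:3])
print(proba)
print(proba.sum(axis=1))

# The predicted label is the class with the highest probability.
print(model.classes_[proba.argmax(axis=1)])
print(model.predict(X_te[:3]))
```

The last two lines print the same labels, which shows that `predict` simply picks the most probable class for each sample.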