Logistic Regression


Binary regression models are fitted via maximum likelihood estimation.

We can use the Binomial probability and maximise the log-likelihood, or equivalently minimise the negative log-likelihood.

For labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i = \sigma(z_i)$ with $z_i = w^\top x_i + b$, the Binomial likelihood is

$$L(w, b) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}$$

so the log-likelihood is

$$\ell(w, b) = \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

and the negative log-likelihood to minimise is $\mathrm{NLL}(w, b) = -\ell(w, b)$.
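As a concrete sketch (not from the original notes; the names X, y, w, b are my own), the negative log-likelihood can be computed as:

```python
import numpy as np

def sigmoid(z):
    # Logistic function. For a production version, use a numerically
    # stable formulation that avoids overflow for large negative z.
    return 1.0 / (1.0 + np.exp(-z))

def negative_log_likelihood(w, b, X, y, eps=1e-12):
    # Binomial NLL: -sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ]
    p = sigmoid(X @ w + b)
    p = np.clip(p, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
```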

To find the gradients, we use backpropagation!

Applying the chain rule to the negative log-likelihood:

$$\frac{\partial \mathrm{NLL}}{\partial p_i} = -\left( \frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i} \right), \qquad \frac{\partial p_i}{\partial z_i} = p_i (1 - p_i), \qquad \frac{\partial z_i}{\partial w} = x_i, \qquad \frac{\partial z_i}{\partial b} = 1$$

Multiplying these together, the terms simplify nicely:

$$\frac{\partial \mathrm{NLL}}{\partial w} = \sum_{i=1}^{n} (p_i - y_i)\, x_i, \qquad \frac{\partial \mathrm{NLL}}{\partial b} = \sum_{i=1}^{n} (p_i - y_i)$$
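A minimal NumPy sketch of these gradients, continuing from the code above (a finite-difference check against negative_log_likelihood should agree with it):

```python
def nll_gradients(w, b, X, y):
    p = sigmoid(X @ w + b)
    residual = p - y            # dNLL/dz_i = p_i - y_i
    grad_w = X.T @ residual     # sum_i (p_i - y_i) x_i
    grad_b = residual.sum()     # sum_i (p_i - y_i)
    return grad_w, grad_b
```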

We also need to derive the gradients for each possible activation function.

Activation | Function | Pointwise Derivative

[Plots omitted: red curve is the function, green is its derivative.]

Sigmoid:
    f(x) = 1 / (1 + e^{-x})
    f'(x) = f(x) (1 - f(x))

ReLU:
    f(x) = max(0, x)
    f'(x) = 1 if x > 0, else 0
    [Note: since ReLU's output is non-negative, the mask can be computed by testing the output for equality with 0.]

Leaky ReLU:
    f(x) = x if x > 0, else 0.01x
    f'(x) = 1 if x > 0, else 0.01
    [The 0.01 slope can be changed; it is a hyperparameter.]

Tanh:
    f(x) = tanh(x)
    f'(x) = 1 - tanh^2(x)

Linear:
    f(x) = x
    f'(x) = 1

Softsign [an approximate sigmoid]:
    f(x) = x / (1 + |x|)
    f'(x) = 1 / (1 + |x|)^2
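A NumPy sketch of the tabulated functions and their derivatives, continuing from the code above (the function names are my own):

```python
def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Because relu(x) is non-negative, (relu(x) != 0) gives the same mask.
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.01):      # alpha (0.01 here) is tunable
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    return np.where(x > 0, 1.0, alpha)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

def softsign(x):
    return x / (1.0 + np.abs(x))

def softsign_grad(x):
    return 1.0 / (1.0 + np.abs(x)) ** 2
```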

By computing gradients backwards through the chain, we can then derive the gradient for each weight and bias.

For a dense layer with pre-activation $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ and output $a^{(l)} = f(z^{(l)})$, backpropagation computes:

$$\delta^{(l)} = \frac{\partial L}{\partial a^{(l)}} \odot f'(z^{(l)})$$

$$\frac{\partial L}{\partial W^{(l)}} = \delta^{(l)} \left( a^{(l-1)} \right)^\top, \qquad \frac{\partial L}{\partial b^{(l)}} = \delta^{(l)}, \qquad \frac{\partial L}{\partial a^{(l-1)}} = \left( W^{(l)} \right)^\top \delta^{(l)}$$
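As an illustrative sketch (my own helper, not from the original notes), one backward step through a dense layer for a single sample:

```python
def backward_layer(dL_da, z, a_prev, W, f_grad):
    # Layer: a = f(z), z = W @ a_prev + b, with vectors dL_da, z, a_prev.
    delta = dL_da * f_grad(z)           # dL/dz (elementwise product)
    grad_W = np.outer(delta, a_prev)    # dL/dW = delta a_prev^T
    grad_b = delta                      # dL/db
    dL_da_prev = W.T @ delta            # passed to the previous layer
    return grad_W, grad_b, dL_da_prev
```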

(c) Copyright Protected: Daniel Han-Chen 2020

License: All content on this page is for educational and personal purposes only. Usage of material, concepts, equations, methods, and all intellectual property on any page in this publication is forbidden for any commercial purpose, be it promotional or revenue generating. I also claim no liability from any damages caused by my material. Knowledge and methods summarized from various sources like papers, YouTube videos and other mediums are protected under the original publishers' licensing arrangements.
