Gradient Descent

Gradient descent for linear regression boils down to four operations:

  1. Calculate the hypothesis h = X * theta

  2. Calculate the loss, loss = h - y, and optionally the squared-error cost, sum(loss^2) / (2m)

  3. Calculate the gradient = X' * loss / m (where X' is the transpose of X)

  4. Update the parameters theta = theta - alpha * gradient

import numpy as np

# m denotes the number of examples here, not the number of 
# features
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    
    for i in range(numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        
        # avg cost per example (the 2 in 2*m doesn't really matter here.
        # But to be consistent with the gradient, I include it)
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        
        # update
        theta = theta - alpha * gradient
        
    return theta
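
A minimal usage sketch, continuing the block above; the synthetic data, learning rate, and iteration count are illustrative assumptions, not part of the original:

# Hypothetical example: fit y = 1 + 2*x on noisy synthetic data
np.random.seed(0)
m = 100
x = np.column_stack([np.ones(m), np.random.rand(m)])  # bias column + one feature
y = x @ np.array([1.0, 2.0]) + 0.1 * np.random.randn(m)

theta = gradientDescent(x, y, np.zeros(2), alpha=0.1, m=m, numIterations=1000)
print(theta)  # should approach [1.0, 2.0]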

Linear Regression Setup

\hat{y}_i = X_{i1} w_1 + X_{i2} w_2 + X_{i3} w_3 + \dots + X_{in} w_n
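
In vectorized form this is just a matrix-vector product. A small sketch; the design matrix X and weight vector w below are made-up values for illustration:

import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # hypothetical 3x2 design matrix
w = np.array([0.5, -0.25])   # hypothetical weight vector

y_hat = X @ w                # y_hat[i] = sum over j of X[i,j] * w[j]
print(y_hat)                 # [0.   0.5  1. ]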

The sigmoid (logistic) function maps any real-valued input into the interval (0, 1):

S(x) = \frac{1}{1+e^{-x}} = \frac{e^x}{e^x + 1}

import numpy as np 

def sigmoid(x):
    # Accept scalars, lists, or arrays; map each element into (0, 1)
    x = np.asarray(x)
    return 1 / (1 + np.exp(-x))
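
For example, S(0) = 0.5, and large positive or negative inputs saturate toward 1 or 0. In logistic regression the linear output X·w would be passed through this function; that combination is sketched in the comment below as an assumption, since the original stops at the sigmoid itself:

print(sigmoid(0))                 # 0.5
print(sigmoid([-2.0, 0.0, 2.0]))  # [0.1192 0.5    0.8808]

# Hypothetical logistic-regression hypothesis: P(y = 1 | x)
# h = sigmoid(np.dot(X, w))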
