Week 1

Regression and classification

$y_i \in \mathbb{R}$ -- regression task

  • salary prediction

  • movie rating prediction

$y_i$ belongs to a finite set -- classification task

  • object recognition

  • topic classification

Linear model for regression

$$a(x) = b + w_1 x_1 + w_2 x_2 + \dots + w_d x_d$$

  • $w_1, \dots, w_d$ -- coefficients (weights)

  • $b$ -- bias

  • $d + 1$ parameters

  • to make it simple: assume there is always a constant feature equal to 1, so the bias becomes just another weight
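A minimal NumPy sketch of the constant-feature trick (the feature values are made up for illustration):

```python
import numpy as np

# Hypothetical feature matrix: n = 3 objects, d = 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Prepend a constant feature of ones: the bias b becomes just
# another weight, giving d + 1 parameters in a single vector w.
X_with_bias = np.hstack([np.ones((X.shape[0], 1)), X])
```

After this step, `X_with_bias` has shape `(3, 3)` and its first column is all ones.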

Vector notation:

$$a(x) = w^T x$$

For a sample $X$:

$$a(X) = Xw, \qquad X = \begin{pmatrix} x_{11} & \cdots & x_{1d} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nd} \end{pmatrix}$$
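In code, $a(X) = Xw$ is a single matrix-vector product. A sketch with made-up numbers (the first column plays the role of the constant feature):

```python
import numpy as np

# Illustrative sample: n = 3 objects, d + 1 = 3 columns
# (constant feature of ones in the first column).
X = np.array([[1.0, 2.0, 0.5],
              [1.0, 1.0, 3.0],
              [1.0, 0.0, 2.0]])
w = np.array([0.5, 2.0, -1.0])   # arbitrary weights for the sketch

predictions = X @ w              # a(X) = Xw, one prediction per object
```

Each entry of `predictions` equals $w^T x_i$ for the corresponding row $x_i$.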

Loss function

How to measure model quality?

Mean squared error:

$$L(w) = \frac{1}{n}\sum_{i=1}^{n}(w^T x_i - y_i)^2 = \frac{1}{n}\Vert Xw - y\Vert^2$$
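The sum form and the norm form of the loss give the same number, which is easy to check numerically (data and weights below are made up):

```python
import numpy as np

# Illustrative data: n = 3 objects with a constant feature.
X = np.array([[1.0, 2.0, 0.5],
              [1.0, 1.0, 3.0],
              [1.0, 0.0, 2.0]])
y = np.array([4.0, 0.0, -1.0])
w = np.array([0.5, 2.0, -1.0])   # arbitrary weights for the sketch

residuals = X @ w - y                           # (w^T x_i - y_i) for each object
mse_sum = np.mean(residuals ** 2)               # sum form of L(w)
mse_norm = np.linalg.norm(X @ w - y) ** 2 / len(y)   # norm form of L(w)
```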

Training a model

Fitting a model to training data:

$$L(w) = \frac{1}{n}\Vert Xw - y\Vert^2 \to \min_w$$

Exact solution:

$$w = (X^T X)^{-1} X^T y$$
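A sketch of the exact solution on synthetic data (in practice one solves the linear system $X^T X w = X^T y$ rather than forming the inverse explicitly, which is cheaper and more numerically stable):

```python
import numpy as np

# Synthetic noiseless data with a known weight vector.
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

# w = (X^T X)^{-1} X^T y, computed via solve() instead of an explicit inverse.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

On noiseless data, `w_hat` recovers `w_true` up to floating-point error.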

But inverting the $d \times d$ matrix $X^T X$ costs $O(d^3)$ operations -- too expensive for high-dimensional data!

Gradient Descent
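A minimal sketch of gradient descent on this loss, using the gradient $\nabla L(w) = \frac{2}{n} X^T (Xw - y)$; the learning rate and iteration count are illustrative choices, not tuned values:

```python
import numpy as np

# Same kind of synthetic noiseless data as above.
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w = np.zeros(d)    # start from zero weights
lr = 0.1           # learning rate (illustrative)
for _ in range(500):
    grad = (2 / n) * X.T @ (X @ w - y)   # gradient of the MSE loss
    w -= lr * grad                       # step against the gradient
```

No matrix inversion is needed: each step costs only matrix-vector products, which is why gradient descent scales to high-dimensional problems where the exact solution does not.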
