# Wk 1

## Regression and classification

$$y_i \in \mathbb{R}$$ -- regression task

* salary prediction
* movie rating prediction

$$y_i$$ belongs to a finite set -- classification task

* object recognition
* topic classification

### Linear model for regression

$$a(x) = b + w_1x_1 + w_2x_2 + \dots + w_dx_d$$

* $$w_1, \dots, w_d$$ -- coefficients (weights)
* $$b$$ -- bias
* $$d + 1$$ parameters in total
* to make it simple: assume there is always a constant feature $$x_0 = 1$$, so the bias $$b$$ is absorbed into the weight vector

Vector notation:

$$
a(x) = w^T x
$$

For a sample $$X$$:

$$
a(X) = Xw \\
X = \begin{pmatrix} x_{11} & \cdots & x_{1d} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nd} \end{pmatrix}
$$
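The matrix notation above maps directly onto one matrix-vector product. A minimal sketch, assuming NumPy (the values of `X` and `w` below are illustrative):

```python
import numpy as np

# Design matrix X: n objects (rows) by d features (columns).
# A constant feature of 1s in the first column absorbs the bias b.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 1.0]])   # n = 2, d = 3
w = np.array([0.5, -1.0, 2.0])    # weight vector with d components

a = X @ w                         # predictions a(X) = Xw, one per object
```

Each entry of `a` is the dot product $$w^T x_i$$ of the weight vector with one row of $$X$$.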

### Loss function

How to measure model quality?

$$
\text{Mean squared error:} \\
L(w) = \frac{1}{n}\sum_{i=1}^{n}(w^Tx_i - y_i)^2 = \frac{1}{n}\Vert Xw - y\Vert^2
$$

### Training a model

Fitting a model to training data:

$$
L(w) = \frac{1}{n} \Vert Xw - y \Vert^2 \rightarrow \min_w
$$

Exact solution:

$$
w = (X^TX)^{-1} X^Ty
$$

**But inverting the $$d \times d$$ matrix $$X^TX$$ costs $$O(d^3)$$, which is infeasible for high-dimensional data!**
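On a small problem the exact solution is easy to check numerically. A sketch, assuming NumPy; the toy data below is made up for illustration. Note that in practice one solves the linear system $$X^TX w = X^Ty$$ rather than forming the inverse explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # toy design matrix, n = 100, d = 3
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                         # noiseless targets, for illustration

# w = (X^T X)^{-1} X^T y, computed via solve() instead of an explicit inverse
w = np.linalg.solve(X.T @ X, X.T @ y)
```

On noiseless data the recovered `w` matches `w_true` up to floating-point error.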

### Gradient Descent

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LptsH5h-j8sW3lrh8PZ%2F-LptsPK4YK72RdznITSs%2FScreen%20Shot%202019-09-28%20at%202.57.36%20PM.png?alt=media\&token=b9b8388c-d9b1-4d6b-85e5-344326c16626)

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LptsH5h-j8sW3lrh8PZ%2F-LptsXrnSbkaLCymRKov%2FScreen%20Shot%202019-09-28%20at%202.58.37%20PM.png?alt=media\&token=50766c7c-5002-4bcc-ab0c-a59794f2c2b5)

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LptsH5h-j8sW3lrh8PZ%2F-LpttDLDseZJbjYpCAJQ%2FScreen%20Shot%202019-09-28%20at%203.01.32%20PM.png?alt=media\&token=ec04d69d-40a6-4dbb-8725-13f386706665)
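The pictures above illustrate the idea: start from some initial $$w$$ and repeatedly step against the gradient of the loss. For the MSE loss the gradient is $$\nabla L(w) = \frac{2}{n} X^T(Xw - y)$$. A minimal sketch, assuming NumPy; the learning rate and step count are illustrative defaults, not tuned values:

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_steps=1000):
    """Minimize the MSE loss by gradient descent.

    The gradient of L(w) = (1/n) ||Xw - y||^2 is (2/n) X^T (Xw - y).
    """
    n, d = X.shape
    w = np.zeros(d)                        # start from the zero vector
    for _ in range(n_steps):
        grad = 2.0 / n * X.T @ (X @ w - y)
        w -= lr * grad                     # step against the gradient
    return w
```

Each iteration costs only matrix-vector products, $$O(nd)$$, so no $$d \times d$$ matrix ever has to be inverted.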
