# ML for Beginners (Video)

## Section 1: Intro to ML

### Types of Learning

* Predictive -> Supervised
* Descriptive -> Unsupervised (We don't know the outcome)

### Term Comparison

| Machine Learning            | Statistics                     |
| --------------------------- | ------------------------------ |
| network, graphs, algorithms | model                          |
| weights                     | parameters                     |
| learning                    | fitting                        |
| supervised learning         | regression/classification      |
| unsupervised learning       | density estimation, clustering |

### Classification

Inputs -> Algorithm -> Class (Qualitative Output)

Prediction Function (y = g(x))

### Regression

Input -> Algorithm -> Number (Quantitative Output)

Function Fitting (y = mx + b)

### Unsupervised Learning

No labeled data.

Goal: find regularities in the input

#### Density Estimation (Statistics)

The input space is structured; as a result, certain patterns occur more often than others.

#### Clustering (ML)

A method for density estimation; the aim is to find clusters or groupings of inputs.
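As a concrete illustration of clustering, here is a minimal k-means sketch (one common clustering method; the points, seed, and cluster count are made up for the example). It alternates between assigning points to the nearest centroid and moving each centroid to the mean of its assigned points.

```python
import numpy as np

# Tiny k-means sketch: alternate assignment and centroid-update steps.
def kmeans(points, k=2, iters=20):
    rng = np.random.default_rng(0)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid
        labels = np.argmin(
            np.linalg.norm(points[:, None] - centroids[None, :], axis=2), axis=1)
        # Move each centroid to the mean of its assigned points
        centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return centroids, labels

points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
centroids, labels = kmeans(points, k=2)
```

On this toy data the two tight groups of points end up in separate clusters.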

### Multivariate Calculus

Best mechanism for talking about smooth changes algebraically.

* Optimization Problems (minimize error)
* Probability Measurement (integration)

#### Optimization via Gradient Descent

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8TVj6W3_ypyshbzo3%2FScreen%20Shot%202019-01-01%20at%2012.53.18%20PM.png?alt=media\&token=77e7ce01-6a54-4f54-9efd-9f2b92b7375c)
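The idea above can be sketched in a few lines: minimize the squared error of a line `y = m*x + b` by repeatedly stepping against the gradient. The data, learning rate, and step count below are made up for illustration.

```python
# Gradient-descent sketch: fit y ~ m*x + b by minimizing mean squared error.
def gradient_descent(xs, ys, lr=0.05, steps=5000):
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to m and b
        grad_m = (2 / n) * sum((m * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((m * x + b - y) for x, y in zip(xs, ys))
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]   # exactly y = 2x + 1
m, b = gradient_descent(xs, ys)
```

With this convex loss the iterates converge toward the exact slope 2 and intercept 1.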

#### Probability Calculations

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8Tk2b66HmKP1rcGoT%2FScreen%20Shot%202019-01-01%20at%2012.54.19%20PM.png?alt=media\&token=10f7b7c8-431b-4468-9f0b-b35c45673d94)

#### Bayesian Inference

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8Tt5pyXmY-0PIHJzM%2FScreen%20Shot%202019-01-01%20at%2012.54.58%20PM.png?alt=media\&token=a2e7d8b8-8c99-4252-ae4d-f21bad5bce5e)
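A worked instance of Bayes' rule, P(A|B) = P(B|A)P(A)/P(B), with hypothetical numbers: a diagnostic test with 99% sensitivity and a 5% false-positive rate, for a condition with 1% prevalence.

```python
# Bayes' rule sketch with made-up numbers.
p_disease = 0.01
p_pos_given_disease = 0.99   # sensitivity
p_pos_given_healthy = 0.05   # false-positive rate (1 - specificity)

# Total probability of a positive test (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

Despite the accurate test, the posterior is only about 1 in 6, because the condition is rare.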

### Statistics and Probability Theory

We need statistics to...

* deal with uncertain events
* formulate probabilities mathematically
* estimate probabilities from data

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8UePinnbt5Lh6FRAF%2FScreen%20Shot%202019-01-01%20at%2012.58.31%20PM.png?alt=media\&token=cafaf98e-47ad-4557-8957-1fad6189d6f1)

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8Vhj6rt1CVGotx_pv%2FScreen%20Shot%202019-01-01%20at%201.03.06%20PM.png?alt=media\&token=0f5f6bbf-7456-4f45-a4f0-b5aaaf18bc87)

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8W68PTZgxfXdfaMyo%2FScreen%20Shot%202019-01-01%20at%201.04.51%20PM.png?alt=media\&token=ff8cbc50-c9f6-460c-a50c-4c094ebfb832)

The more data you have, the better.

### Linear Algebra

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LV869g62N6hz4dJGK1c%2F-LV8WmzfuJnsPwlKommv%2FScreen%20Shot%202019-01-01%20at%201.07.49%20PM.png?alt=media\&token=1cd4e987-8e3e-4cd1-96fb-cafd61a5893e)

#### Minimum Linear Algebra Knowledge for ML

* Notation
  * Knowing linear algebra notation is essential to understand the algorithm structure referenced in papers, books, etc.
* Operations
  * Working at the next level of abstraction, in vectors and matrices, is essential for ML. Learn to apply simple operations such as adding, multiplying, inverting, and transposing matrices and vectors.
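The basic operations listed above look like this in NumPy (the matrix and vector here are arbitrary examples):

```python
import numpy as np

# A few of the basic matrix/vector operations referenced above.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 2.0])

Av = A @ v                # matrix-vector product
At = A.T                  # transpose (A is symmetric here, so At == A)
Ainv = np.linalg.inv(A)   # inverse; A @ Ainv is (numerically) the identity
```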

### Recommended Resources

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) by [Trevor Hastie](https://www.amazon.com/s/ref=dp_byline_sr_ebooks_1?ie=UTF8\&text=Trevor+Hastie\&search-alias=digital-text\&field-author=Trevor+Hastie\&sort=relevancerank), [Robert Tibshirani](https://www.amazon.com/Robert-Tibshirani/e/B00H3VSM7W/ref=dp_byline_cont_ebooks_2), and [Jerome Friedman](https://www.amazon.com/s/ref=dp_byline_sr_ebooks_3?ie=UTF8\&text=Jerome+Friedman\&search-alias=digital-text\&field-author=Jerome+Friedman\&sort=relevancerank)

Information Theory, Inference and Learning Algorithms by [David J. C. MacKay](https://www.amazon.com/David-J.-C.-MacKay/e/B001HCVANQ/ref=dp_byline_cont_book_1)

## Section 2: Supervised Learning (part 1)

> ...inferring a function from labeled training data
>
> Outputs
>
> * Qualitative (Classification)
> * Quantitative (Regression)

### Terminology

* **Generalization** - how well our hypothesis will correctly classify future examples that are not part of the training set
* **Most Specific Hypothesis (S)** - the tightest rectangle that includes all of the positive examples and none of the negative examples
* **Most General Hypothesis (G)** - the largest rectangle that includes all the positive examples and none of the negative examples
* **Doubt** - a case that falls in between the most specific hypothesis (S) and the most general hypothesis (G)

### Linear Methods for Classification

* Linear Models
  * Least Squares
  * Nearest Neighbors (kNN)

#### Math Notation

$$X^T = (X\_1, X\_2, ... , X\_n)$$

$$\hat{Y} = \hat{\beta\_0} + \sum^n\_{j=1} X\_j\hat{\beta\_j}$$

$$\hat{Y} = X^T \hat{\beta}$$

Least Squares -> Residual Sum of Squares (RSS)
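A minimal least-squares sketch, using the closed-form normal equations (the data below are made up and fit the line exactly): the coefficient vector that minimizes the RSS is the solution of $$X^T X \hat{\beta} = X^T y$$.

```python
import numpy as np

# Least squares via the normal equations; minimizes the residual sum of squares.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])          # leading column of ones gives the intercept
y = np.array([1.0, 3.0, 5.0, 7.0])  # exactly y = 1 + 2x

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
rss = float(np.sum((y - X @ beta_hat) ** 2))  # residual sum of squares
```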

#### Nearest Neighbor (kNN)

$$\hat{Y}(X) = \frac{1}{k} \sum\_{x\_i \in N\_k(x)} y\_i$$
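The formula above averages the targets of the k nearest training points; a one-dimensional sketch (with made-up data):

```python
# k-nearest-neighbor regression: average the y-values of the k closest points.
def knn_predict(x, train_x, train_y, k=3):
    # Sort training points by distance to x and average the k nearest targets
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

train_x = [0, 1, 2, 3, 4]
train_y = [0, 2, 4, 6, 8]            # y = 2x
pred = knn_predict(2, train_x, train_y, k=3)   # averages y at x = 1, 2, 3
```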

### Linear Methods for Regression

Goal: learn a numerical function

#### Inputs

* quantitative
* transformations of quantitative inputs (log, square-root, square)
* polynomial representations (basis expansions)
* interactions between variables

### Data Distribution Assumptions

* Inputs $$x$$ are fixed, or non-random
* Observations $$y$$ are uncorrelated and have constant variance

### Support Vector Machines (SVM)

(Kernel Machines)

* A discriminant-based method
* The weight vector can be written in terms of a subset of the training set (the support vectors)
* Kernel functions can be used to solve nonlinear cases
* Training presents a convex optimization problem

#### Vectorial Kernels

* polynomials of degree q
* radial-basis functions (use cross validation)
* sigmoidal functions
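The three kernel families above can be written out directly; the parameter values here (degree, bandwidth, slope) are illustrative defaults, not recommendations:

```python
import numpy as np

# Sketches of the three vectorial kernels listed above.
def poly_kernel(x, z, q=2):
    return (1.0 + x @ z) ** q                     # polynomial of degree q

def rbf_kernel(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))  # radial-basis function

def sigmoid_kernel(x, z, a=1.0, b=0.0):
    return np.tanh(a * (x @ z) + b)               # sigmoidal

x = np.array([1.0, 2.0])
z = np.array([0.5, 1.0])
k_poly = poly_kernel(x, z)   # (1 + 2.5)^2 = 12.25
```

In practice the RBF bandwidth (`gamma`) is typically chosen by cross-validation, as the notes mention.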

### Basis Expansions

#### The Big Idea!

Augment or replace the vector of inputs with additional variables, which are transformations of the inputs, and then use linear models in this new space of derived input features.

#### Linear Basis Expansion

$$f(X) = \sum^M\_{m=1} \beta\_m h\_m (X)$$
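A concrete instance of the expansion above, where the basis functions $$h\_m$$ are the polynomial terms 1, x, x²; the model is still linear in the betas, so ordinary least squares applies (the data are made up and match the target exactly):

```python
import numpy as np

# Linear basis expansion: fit a linear model in the derived features [1, x, x^2].
def poly_basis(x, degree=2):
    return np.column_stack([x ** m for m in range(degree + 1)])

x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2 + 1.0                  # a nonlinear target in x

H = poly_basis(x, degree=2)       # columns: 1, x, x^2
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
```

The recovered coefficients are (approximately) 1 for the constant term, 0 for x, and 1 for x².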

#### Piecewise Polynomials and Splines

* Divide the domain of X into intervals
* Represent $$f(X)$$ with a separate basis function in each interval

![The dashed lines are knots](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LVDrhO4DPoLzMcSqaQM%2F-LVICgxYqeGSQmr1zM9g%2FScreen%20Shot%202019-01-03%20at%2010.16.06%20AM.png?alt=media\&token=47ad7f76-380c-49d4-89b6-548de16560ac)

### Model Selection Procedures

![The data by itself is not sufficient!](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LVDrhO4DPoLzMcSqaQM%2F-LVIDbr0ruaLSEimqMns%2FScreen%20Shot%202019-01-03%20at%2010.20.12%20AM.png?alt=media\&token=9ce7e5a2-7e70-4254-8f1d-6732159ae650)

#### Inductive Bias

* Assuming linear function
* Minimizing Squared Error

Choosing the right bias is called **model selection**

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LVDrhO4DPoLzMcSqaQM%2F-LVIFI895M12eh-oJpnr%2FScreen%20Shot%202019-01-03%20at%2010.27.33%20AM.png?alt=media\&token=35b51f9d-eef4-464f-bec1-bf24d0edcc3b)

![](https://3501392451-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LLZ89zzVxrdnG1RG6CA%2F-LVDrhO4DPoLzMcSqaQM%2F-LVIGMGt8LB58tFT9mF6%2FScreen%20Shot%202019-01-03%20at%2010.32.11%20AM.png?alt=media\&token=d5a1ae28-f185-40cb-867c-f0f17ee898a1)
