Extensions To Linear Models
Last updated
In statistics, an interaction occurs when two or more explanatory variables combine in a non-additive way to affect a response variable. In other words, the variables interact to produce an effect that is more (or less) than the sum of their individual effects.
Let's formalize this:
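A minimal numeric sketch of a two-variable interaction model (the coefficient values are purely illustrative, not fitted to any dataset): y = b0 + b1*x1 + b2*x2 + b3*(x1*x2), where the b3 term makes the effect of x1 depend on the value of x2.

```python
# Illustrative interaction model: y = b0 + b1*x1 + b2*x2 + b3*(x1 * x2).
# The coefficients below are made up for demonstration.
def predict(x1, x2, b0=1.0, b1=2.0, b2=3.0, b3=4.0):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# The effect of raising x1 from 0 to 1 depends on the value of x2,
# so the model is non-additive:
effect_when_x2_is_0 = predict(1, 0) - predict(0, 0)  # b1 = 2.0
effect_when_x2_is_1 = predict(1, 1) - predict(0, 1)  # b1 + b3 = 6.0
print(effect_when_x2_is_0, effect_when_x2_is_1)
```

If b3 were zero, both differences would equal b1 and the model would be purely additive.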
Under-fitting happens when a model can neither model the training data nor generalize to new data.
Our simple linear regression model fitted earlier was an under-fitted model.
Overfitting happens when a model fits the training data too well; so well, in fact, that it fails to generalize to new data.
Lasso and Ridge are two commonly used regularization techniques. Regularization is the general term for methods that combat overfitting.
Ridge regression is often also referred to as L2 Norm Regularization
Lasso regression is often also referred to as L1 Norm Regularization
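Ridge regression has a convenient closed-form solution: w = (XᵀX + αI)⁻¹Xᵀy, where α is the regularization strength. A minimal sketch using NumPy (the data and variable names are made up for illustration):

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge (L2) solution: w = (X^T X + alpha*I)^-1 X^T y.
    alpha is the regularization strength; alpha = 0 recovers ordinary
    least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic data for demonstration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 3.0]) + rng.normal(scale=0.1, size=50)

w_ols = ridge_fit(X, y, alpha=0.0)
w_ridge = ridge_fit(X, y, alpha=100.0)

# A larger alpha shrinks the coefficient vector toward zero:
print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

Lasso (L1) has no closed-form solution, since the L1 penalty is not differentiable at zero; it is typically fit with an iterative method such as coordinate descent.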
AIC(model) = -2 * log-likelihood(model) + 2 * (number of parameters)
BIC(model) = -2 * log-likelihood(model) + log(number of observations) * (number of parameters)
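Under a Gaussian error model, the maximized log-likelihood of a linear regression can be written in terms of the residual sum of squares, which gives a direct way to compute the formulas above. A sketch with illustrative helper names (not from any particular library):

```python
import math

def gaussian_log_likelihood(rss, n):
    # Maximized log-likelihood of a linear model with Gaussian errors,
    # where the error variance is estimated as rss / n.
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)

def aic(rss, n, k):
    # k = number of parameters in the model
    return -2 * gaussian_log_likelihood(rss, n) + 2 * k

def bic(rss, n, k):
    return -2 * gaussian_log_likelihood(rss, n) + math.log(n) * k

# For the same fit, BIC penalizes extra parameters more heavily than AIC
# whenever log(n) > 2, i.e. roughly n > 7:
print(aic(rss=10.0, n=100, k=3), bic(rss=10.0, n=100, k=3))
```

Note that conventions differ on whether the estimated error variance itself is counted in k; what matters for model comparison is using the same convention for every candidate model.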
Performing feature selection: compare models with fewer and more variables, compute the AIC/BIC for each, and select the set of features that yields the lowest AIC or BIC
Similarly, including or excluding interaction/polynomial features depending on whether the AIC/BIC decreases when they are added
Computing the AIC and BIC for several values of the regularization parameter in Ridge/Lasso models and selecting the best regularization parameter.
Many more!
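As a concrete illustration of the first use case above, one might compare a model with and without an extra, irrelevant feature and keep whichever has the lower AIC. A sketch under a Gaussian error model (the data and names are made up for illustration):

```python
import math
import numpy as np

def aic_of_fit(X, y):
    """AIC of an ordinary least-squares fit with Gaussian errors;
    k counts the fitted coefficients."""
    n, k = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return -2 * log_lik + 2 * k

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
noise_feature = rng.normal(size=200)  # unrelated to the response
y = 2.0 * x1 + rng.normal(scale=0.5, size=200)

X_small = np.column_stack([np.ones(200), x1])            # intercept + x1
X_big = np.column_stack([np.ones(200), x1, noise_feature])

# Adding an irrelevant feature always lowers the RSS slightly, but the
# AIC penalty term typically makes the smaller model win anyway.
print(aic_of_fit(X_small, y), aic_of_fit(X_big, y))
```

The same comparison with BIC in place of AIC would penalize the extra feature even more heavily here, since log(200) > 2.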